Michael Hogge - April 2023
Emotion AI, also known as artificial emotional intelligence, is a subset of artificial intelligence dealing with the detection and replication of human emotion by machines. The successful creation of this "artificial empathy" hinges on a computer's ability to analyze, among other things, human text, speech, and facial expressions. In support of these efforts, this project leverages the power of convolutional neural networks (CNN) to create a computer vision model capable of accurately performing multi-class classification on images containing one of four facial expressions: happy, sad, neutral, and surprise.
Data provided for this project includes over 20,000 grayscale images split into training (75%), validation (24.5%), and test (0.5%) datasets, and further divided into the aforementioned classes. At the outset of the project, a visual analysis of the data is undertaken and a slight imbalance is noted in the class distribution, with 'surprise' images making up a smaller percentage of total images when compared to 'happy,' 'sad,' and 'neutral' images. The unique characteristics of each class are discussed (e.g., images labeled as 'surprise' tend to contain faces with wide open mouths and eyes), including a breakdown of average pixel value by class.
Following the data visualization and analysis phase of the project, nine CNNs are developed, ranging from simple grayscale models to complex transfer learning architectures comprising hundreds of layers and tens of millions of parameters. Basic models are shown to lack the complexity required to properly fit the data, while the transfer learning models (VGG16, ResNet v2, and EfficientNet) are shown to be too complex for the amount and type of data provided for this project. The unsatisfactory performance of the basic and transfer learning models necessitates the development of an alternative model capable of fitting the data and achieving acceptable accuracy while maintaining a high level of generalizability. The proposed model, with four convolutional blocks and 1.8 million parameters, displays high accuracy (75% on training, validation, and test data) when compared to human performance (approximately 65%) on similar data, and avoids overfitting the training data, which can be difficult to achieve with CNNs.
The deployability of this model depends entirely on its intended use. With an accuracy of 75%, deployment in a marketing or gaming setting is perfectly reasonable, assuming consent has been granted, and the handling of highly personal data is done in an ethical, transparent manner with data privacy coming before profit. However, deployment in circumstances where the output from this model could cause serious material damage to an individual (e.g., hiring decisions, law enforcement, evidence in a court of law, etc.) should be avoided. While computer vision models can become quite skilled at classifying human facial expressions (particularly if they are trained on over-emoting/exaggerated images), it is important to note that a connection between those expressions and any underlying emotion is not a hard scientific fact. For example, a smiling person may not always be happy (e.g., they could be uncomfortable or polite), a crying person may not always be sad (e.g., they could be crying tears of joy), and someone who is surprised may be experiencing compound emotions (e.g., happily surprised or sadly surprised).
There is certainly scope to improve the proposed model, including the ethical sourcing of additional, diverse training images, and further data augmentation on top of what is already performed during the development of the proposed model. In certain scenarios, as indicated above, model deployment could proceed with 75% accuracy, and continued improvement could be pursued by the business/organization/government as time and funding allow. Before model deployment, a set of guiding ethical principles should be developed and adhered to throughout the data collection, analysis, and (possibly) storage phases. Stakeholders must ensure transparency throughout all stages of the computer vision life cycle, while monitoring the overall development of Emotion AI technology and anticipating future regulatory action, which appears likely.
Context:
How do humans communicate with one another? While spoken and written communication may immediately come to mind, research by Dr. Albert Mehrabian has found that over 50% of communication is conveyed through body language, including facial expressions. In face-to-face conversation, body language, it turns out, plays a larger role in how our message is interpreted than both the words we choose and the tone with which we deliver them. Our expression is a powerful window into our true feelings, and as such, it can be used as a highly effective proxy for sentiment, particularly in the absence of written or spoken communication.
Emotion AI (artificial emotional intelligence, or affective computing) attempts to leverage this proxy for sentiment by detecting and processing facial expressions (through neural networks) in an effort to successfully interpret human emotion and respond appropriately. Developing models that can accurately detect facial emotion is therefore an important driver of advancement in the realm of artificial intelligence and emotionally intelligent machines. The ability to successfully extract sentiment from images and video is also a powerful tool for businesses looking to draw insights from the troves of unstructured data they have accumulated in recent years, or even to extract second-by-second customer responses to advertisements, store layouts, customer/user experience, etc.
Objective:
The objective of this project is to utilize deep learning techniques, including convolutional neural networks, to create a computer vision model that can accurately detect and interpret facial emotions. This model should be capable of performing multi-class classification on images containing one of four facial expressions: happy, sad, neutral, and surprise.
Key Questions:
The data set consists of 3 folders, i.e., 'test', 'train', and 'validation'. Each of these folders has four subfolders:
‘happy’: Images of people who have happy facial expressions.
‘sad’: Images of people with sad or upset facial expressions.
‘surprise’: Images of people who have shocked or surprised facial expressions.
‘neutral’: Images of people showing no prominent emotion in their facial expression at all.
import zipfile # Used to unzip the data
import numpy as np # Mathematical functions, arrays, etc.
import pandas as pd # Data manipulation and analysis
import os # Misc operating system interfaces
import h5py # Read and write h5py files
import random
import matplotlib.pyplot as plt # A library for data visualization
from matplotlib import image as mpimg # Used to show images from filepath
import seaborn as sns # An advanced library for data visualization
from PIL import Image # Image processing
import cv2 # Image processing
# Importing Deep Learning Libraries, layers, models, optimizers, etc
import tensorflow as tf
from tensorflow.keras.preprocessing.image import load_img, img_to_array, ImageDataGenerator
from tensorflow.keras.layers import Dense, Input, Dropout, SpatialDropout2D, GlobalAveragePooling2D, Flatten, Conv2D, BatchNormalization, Activation, MaxPooling2D, LeakyReLU, GaussianNoise
from tensorflow.keras.models import Model, Sequential
from tensorflow.keras.optimizers import Adam, SGD, RMSprop, Adadelta
from tensorflow.keras import regularizers
from tensorflow.keras.regularizers import l2
from tensorflow.keras.losses import categorical_crossentropy
from tensorflow.keras.utils import to_categorical
import tensorflow.keras.applications as ap
from tensorflow.keras.applications.vgg16 import VGG16
from tensorflow.keras.callbacks import ModelCheckpoint, EarlyStopping, ReduceLROnPlateau
from tensorflow.keras import backend
# Reproducibility within TensorFlow
import fwr13y.d9m.tensorflow as tf_determinism
tf_determinism.enable_determinism()
tf.config.experimental.enable_op_determinism() # Must be called, not just referenced, to take effect
from tqdm import tqdm # Generates progress bars
# Predictive data analysis tools
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler
from sklearn.metrics import classification_report
from sklearn.metrics import confusion_matrix
# To suppress warnings
import warnings
warnings.filterwarnings("ignore")
# Needed to silence tensorflow messages while running locally
from silence_tensorflow import silence_tensorflow
silence_tensorflow()
fwr13y.d9m.tensorflow.enable_determinism (version 0.4.0) has been applied to TensorFlow version 2.9.0
# Fixing the seed for random number generators to ensure reproducibility
np.random.seed(42)
random.seed(42)
tf.random.set_seed(42)
# Ensuring reproducibility using GPU with TensorFlow
os.environ['TF_DETERMINISTIC_OPS'] = '1'
# Extracting image files from the zip file
with zipfile.ZipFile("Facial_emotion_images.zip", "r") as zip_ref:
    zip_ref.extractall()
dir_train = "Facial_emotion_images/train/" # Path of training data after unzipping
dir_validation = "Facial_emotion_images/validation/" # Path of validation data after unzipping
dir_test = "Facial_emotion_images/test/" # Path of test data after unzipping
img_size = 48 # Defining the size of the image as 48 pixels
# Custom function to display first 35 images from the specified training folder
def display_emotion(emotion):
    train_emotion = dir_train + emotion + "/"
    plt.figure(figsize = (11, 11))
    for i in range(1, 36):
        plt.subplot(5, 7, i)
        img = load_img(train_emotion + os.listdir(train_emotion)[i],
                       target_size = (img_size, img_size))
        plt.imshow(img)
    plt.show()
print("These are the first 35 training images labeled as 'Happy':")
display_emotion("happy")
These are the first 35 training images labeled as 'Happy':
# An example image pulled from the images above
img_x = os.listdir("Facial_emotion_images/train/happy/")[16]
img_happy_16 = mpimg.imread("Facial_emotion_images/train/happy/"+img_x)
plt.figure(figsize = (2, 2))
plt.imshow(img_happy_16, cmap='Greys_r')
plt.show()
Observations and Insights: Happy
print("These are the first 35 training images labeled as 'Sad':")
display_emotion("sad")
These are the first 35 training images labeled as 'Sad':
# An example image pulled from the images above
img_x = os.listdir("Facial_emotion_images/train/sad/")[7]
img_sad_7 = mpimg.imread("Facial_emotion_images/train/sad/"+img_x)
plt.figure(figsize = (2, 2))
plt.imshow(img_sad_7, cmap='Greys_r')
plt.show()
Observations and Insights: Sad
print("These are the first 35 training images labeled as 'Neutral':")
display_emotion("neutral")
These are the first 35 training images labeled as 'Neutral':
# An example image pulled from the images above
img_x = os.listdir("Facial_emotion_images/train/neutral/")[26]
img_neutral_26 = mpimg.imread("Facial_emotion_images/train/neutral/"+img_x)
plt.figure(figsize = (2, 2))
plt.imshow(img_neutral_26, cmap='Greys_r')
plt.show()
Observations and Insights: Neutral
print("These are the first 35 training images labeled as 'Surprise':")
display_emotion("surprise")
These are the first 35 training images labeled as 'Surprise':
# An example image pulled from the images above
img_x = os.listdir("Facial_emotion_images/train/surprise/")[17]
img_surprise_17 = mpimg.imread("Facial_emotion_images/train/surprise/"+img_x)
plt.figure(figsize = (2, 2))
plt.imshow(img_surprise_17, cmap='Greys_r')
plt.show()
Observations and Insights: Surprise
Overall Insights from Visualization of Classes:
# Getting the count of images in each training folder and saving to variables
train_happy = len(os.listdir(dir_train + "happy/"))
train_sad = len(os.listdir(dir_train + "sad/"))
train_neutral = len(os.listdir(dir_train + "neutral/"))
train_surprised = len(os.listdir(dir_train + "surprise/"))
# Creating a Pandas series called "train_series" and converting to Pandas dataframe called "train_df"
# in order to display the table below. The dataframe will also contribute to bar charts farther below.
train_series = pd.Series({'Happy': train_happy, 'Sad': train_sad, 'Neutral': train_neutral,
'Surprised': train_surprised})
train_df = pd.DataFrame(train_series, columns = ['Total Training Images'])
train_df["Percentage"] = round((train_df["Total Training Images"] / train_df["Total Training Images"].sum())*100, 1)
train_df.index.name='Emotions'
print("The distribution of classes within the training data:")
train_df
The distribution of classes within the training data:
| Emotions | Total Training Images | Percentage |
|---|---|---|
| Happy | 3976 | 26.3 |
| Sad | 3982 | 26.4 |
| Neutral | 3978 | 26.3 |
| Surprised | 3173 | 21.0 |
train_df.sum()
Total Training Images    15109.0
Percentage                 100.0
dtype: float64
Observations: Training Images
# Getting count of images in each validation folder and saving to variables
val_happy = len(os.listdir(dir_validation + "happy/"))
val_sad = len(os.listdir(dir_validation + "sad/"))
val_neutral = len(os.listdir(dir_validation + "neutral/"))
val_surprised = len(os.listdir(dir_validation + "surprise/"))
# Creating a Pandas series called "val_series" and converting to Pandas dataframe called "val_df"
# in order to display the table below. The dataframe will also contribute to bar charts farther below.
val_series = pd.Series({'Happy': val_happy, 'Sad': val_sad, 'Neutral': val_neutral,
'Surprised': val_surprised})
val_df = pd.DataFrame(val_series, columns = ['Total Validation Images'])
val_df["Percentage"] = round((val_df["Total Validation Images"] / val_df["Total Validation Images"].sum())*100, 1)
val_df.index.name='Emotions'
print("The distribution of classes within the validation data:")
val_df
The distribution of classes within the validation data:
| Emotions | Total Validation Images | Percentage |
|---|---|---|
| Happy | 1825 | 36.7 |
| Sad | 1139 | 22.9 |
| Neutral | 1216 | 24.4 |
| Surprised | 797 | 16.0 |
val_df.sum()
Total Validation Images    4977.0
Percentage                  100.0
dtype: float64
Observations: Validation Images
# Getting count of images in each test folder and saving to variables
test_happy = len(os.listdir(dir_test + "happy/"))
test_sad = len(os.listdir(dir_test + "sad/"))
test_neutral = len(os.listdir(dir_test + "neutral/"))
test_surprised = len(os.listdir(dir_test + "surprise/"))
# Creating a Pandas series called "test_series" and converting to Pandas dataframe called "test_df"
# in order to display the table below. The dataframe will also contribute to bar charts farther below.
test_series = pd.Series({'Happy': test_happy, 'Sad': test_sad, 'Neutral': test_neutral,
'Surprised': test_surprised})
test_df = pd.DataFrame(test_series, columns = ['Total Test Images'])
test_df["Percentage"] = round((test_df["Total Test Images"] / test_df["Total Test Images"].sum())*100, 1)
test_df.index.name='Emotions'
print("The distribution of classes within the test data:")
test_df
The distribution of classes within the test data:
| Emotions | Total Test Images | Percentage |
|---|---|---|
| Happy | 32 | 25.0 |
| Sad | 32 | 25.0 |
| Neutral | 32 | 25.0 |
| Surprised | 32 | 25.0 |
test_df.sum()
Total Test Images    128.0
Percentage           100.0
dtype: float64
Observations: Test Images
# Concatenating train_df, val_df, and test_df to create "df_total" in order to create the chart below
df_total = pd.concat([train_df, val_df, test_df], axis=1)
df_total.drop(['Percentage'], axis=1, inplace=True)
df_total = df_total.reset_index()
df_total.rename(columns={"index":"Emotions", "Total Training Images":"Train",
"Total Validation Images":"Validate", "Total Test Images":"Test"}, inplace=True)
# Creating bar chart below, grouped by class (i.e. 'emotion') and broken down into "train", "validate",
# and "test" data. The x-axis is Emotions and the y-axis is Total Images.
df_total.groupby("Emotions", sort=False).mean().plot(kind='bar', figsize=(10,5),
title="TOTAL TRAINING, VALIDATION and TEST IMAGES",
ylabel="Total Images", rot=0, fontsize=12, width=0.9, colormap="Pastel2",
edgecolor='black')
plt.show()
Observations:
# Concatenating train_df, val_df, and test_df to create "df_percent" in order to create the chart below
df_percent = pd.concat([train_df, val_df, test_df], axis=1)
df_percent.drop(['Total Training Images', 'Total Validation Images', 'Total Test Images'], axis=1, inplace=True)
df_percent.columns = ["Train", "Validate", "Test"]
# Creating bar chart below, grouped by class (i.e. 'emotion') and broken down into "train", "validate",
# and "test" data. The x-axis is Emotions and the y-axis is Percentage of Total Images.
df_percent.groupby("Emotions", sort=False).mean().plot(kind='bar', figsize=(10,5),
title="PERCENTAGE OF TOTAL TRAINING, VALIDATION and TEST IMAGES",
ylabel="Percentage of Total Images", rot=0, fontsize=12, width=0.9, colormap="Pastel2",
edgecolor='black')
plt.show()
Observations:
# Custom function to compute the average pixel value across all images in a folder.
# For each image, the mean of its pixel values is taken; the folder average is the
# mean of those per-image means, rounded to two decimal places.
def avg_pixel_value(folder):
    image_means = []
    for fname in os.listdir(folder):
        with Image.open(folder + fname, 'r') as im:
            pix_val = list(im.getdata())
            image_means.append(sum(pix_val) / len(pix_val))
    return round(sum(image_means) / len(image_means), 2)

# Obtaining the average pixel value by class for training, validation, and test images
train_happy_pixel_avg = avg_pixel_value(dir_train + "happy/")
val_happy_pixel_avg = avg_pixel_value(dir_validation + "happy/")
test_happy_pixel_avg = avg_pixel_value(dir_test + "happy/")
train_sad_pixel_avg = avg_pixel_value(dir_train + "sad/")
val_sad_pixel_avg = avg_pixel_value(dir_validation + "sad/")
test_sad_pixel_avg = avg_pixel_value(dir_test + "sad/")
train_neutral_pixel_avg = avg_pixel_value(dir_train + "neutral/")
val_neutral_pixel_avg = avg_pixel_value(dir_validation + "neutral/")
test_neutral_pixel_avg = avg_pixel_value(dir_test + "neutral/")
train_surprise_pixel_avg = avg_pixel_value(dir_train + "surprise/")
val_surprise_pixel_avg = avg_pixel_value(dir_validation + "surprise/")
test_surprise_pixel_avg = avg_pixel_value(dir_test + "surprise/")
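As an aside, the per-image loops above can also be vectorized with NumPy once the images are stacked into a single array. The sketch below demonstrates the equivalence on synthetic 48x48 grayscale data (random arrays standing in for the actual image files):

```python
import numpy as np

# Synthetic stand-ins for a folder of 48x48 grayscale images
rng = np.random.default_rng(42)
images = rng.integers(0, 256, size=(10, 48, 48))

# Vectorized: per-image mean over both pixel axes, then mean across images
per_image_means = images.mean(axis=(1, 2))
folder_avg = round(per_image_means.mean(), 2)

# Equivalent loop-based computation (what the code above does per folder)
loop_means = [sum(img.flatten()) / img.size for img in images]
loop_avg = round(sum(loop_means) / len(loop_means), 2)

print(folder_avg, loop_avg)  # the two approaches agree
```

For the roughly 20,000 images in this project, the vectorized form avoids building a Python list per image and is substantially faster.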
# creating dictionary containing average pixel values by class
dict_pixel_avg = {
"Emotion": ["Happy", "Sad", "Neutral", "Surprise"],
"Train": [train_happy_pixel_avg, train_sad_pixel_avg, train_neutral_pixel_avg, train_surprise_pixel_avg],
"Validate": [val_happy_pixel_avg, val_sad_pixel_avg, val_neutral_pixel_avg, val_surprise_pixel_avg],
"Test": [test_happy_pixel_avg, test_sad_pixel_avg, test_neutral_pixel_avg, test_surprise_pixel_avg]}
# converting dictionary to dataframe
df_pixel_avg = pd.DataFrame.from_dict(dict_pixel_avg)
df_pixel_avg
| | Emotion | Train | Validate | Test |
|---|---|---|---|---|
| 0 | Happy | 128.92 | 129.27 | 134.07 |
| 1 | Sad | 121.10 | 120.25 | 125.68 |
| 2 | Neutral | 124.09 | 123.92 | 127.68 |
| 3 | Surprise | 145.78 | 148.32 | 144.59 |
# plotting pixel averages for training, validation and test images
df_pixel_avg.groupby("Emotion", sort=False).mean().plot(kind='bar', figsize=(10,5),
title="PIXEL AVERAGES FOR TRAINING, VALIDATION and TEST IMAGES",
ylabel="Pixel Averages", rot=0, fontsize=12, width=0.9, colormap="Pastel2",
edgecolor='black')
plt.legend(loc=(1.01, 0.5))
plt.show()
Observations: Pixel Values
Note:
Data pre-processing and augmentation will take place during the creation of data loaders. When ImageDataGenerator objects are instantiated, a range of processes can and will be applied, sometimes to varying degrees, depending on the model being created and trained. The process/augmentation operations used here include horizontal_flip, brightness_range, rescale, and shear_range.
While creating our data sets via flow_from_directory, we have an opportunity to set class_mode to 'categorical', which will essentially one-hot-encode our classes. The classes themselves are then defined as 'happy,' 'sad,' 'neutral,' and 'surprise.' This allows us to set our loss to categorical_crossentropy, which itself is used for multi-class classification where each image (in our case) belongs to a single class.
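As a quick illustration (using plain NumPy rather than Keras), here is what a one-hot label looks like for our four classes and how categorical cross-entropy scores a softmax output against it; the prediction values below are invented for the example:

```python
import numpy as np

classes = ['happy', 'sad', 'neutral', 'surprise']

# One-hot label for a 'neutral' image (index 2), as produced by
# class_mode = 'categorical'
y_true = np.zeros(len(classes))
y_true[classes.index('neutral')] = 1.0   # [0., 0., 1., 0.]

# A hypothetical softmax output from a model
y_pred = np.array([0.1, 0.2, 0.6, 0.1])

# Categorical cross-entropy: -sum(y_true * log(y_pred)); because y_true is
# one-hot, only the probability assigned to the true class contributes
loss = -np.sum(y_true * np.log(y_pred))
print(round(loss, 4))  # -ln(0.6) = 0.5108
```

The closer the model pushes probability toward the true class, the smaller this loss becomes, which is exactly what training minimizes.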
Creating data loaders that we will use as inputs to our initial neural networks. We will create separate data loaders for color_modes grayscale and RGB so we can compare the results. An image that is grayscale has only 1 channel, with pixel values ranging from 0 to 255, while an RGB image has 3 channels, with each pixel having a value for red, green, and blue. Images that are RGB are therefore more complex for a neural network to process.
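To make the channel difference concrete, the toy sketch below (synthetic arrays, purely illustrative) builds a one-channel grayscale array, replicates it into three channels to mimic RGB, and converts back to a single luminance channel using the standard ITU-R BT.601 weights:

```python
import numpy as np

# A synthetic 48x48 grayscale image: one channel, values 0-255
gray = np.random.default_rng(0).integers(0, 256, size=(48, 48, 1))

# The same image as RGB: three channels (here simply replicated)
rgb = np.repeat(gray, 3, axis=-1)

print(gray.shape)  # (48, 48, 1)
print(rgb.shape)   # (48, 48, 3)

# A common luminance conversion back to one channel (BT.601 weights)
luma = 0.299 * rgb[..., 0] + 0.587 * rgb[..., 1] + 0.114 * rgb[..., 2]
print(luma.shape)  # (48, 48)
```

Since an RGB input triples the values per image, it triples the weights in a network's first convolutional layer as well, which is part of why RGB inputs are more expensive to process.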
batch_size = 32
# Creating ImageDataGenerator objects for grayscale colormode
# (the same augmentation is applied to all three splits here; validation/test
# generators would more typically apply only rescale = 1./255)
datagen_train_grayscale = ImageDataGenerator(horizontal_flip = True,
brightness_range = (0.,2.),
rescale = 1./255,
shear_range = 0.3)
datagen_validation_grayscale = ImageDataGenerator(horizontal_flip = True,
brightness_range = (0.,2.),
rescale = 1./255,
shear_range = 0.3)
datagen_test_grayscale = ImageDataGenerator(horizontal_flip = True,
brightness_range = (0.,2.),
rescale = 1./255,
shear_range = 0.3)
# Creating ImageDataGenerator objects for RGB colormode
datagen_train_rgb = ImageDataGenerator(horizontal_flip = True,
brightness_range = (0.,2.),
rescale = 1./255,
shear_range = 0.3)
datagen_validation_rgb = ImageDataGenerator(horizontal_flip = True,
brightness_range = (0.,2.),
rescale = 1./255,
shear_range = 0.3)
datagen_test_rgb = ImageDataGenerator(horizontal_flip = True,
brightness_range = (0.,2.),
rescale = 1./255,
shear_range = 0.3)
# Creating train, validation, and test sets for grayscale colormode
print("Grayscale Images")
train_set_grayscale = datagen_train_grayscale.flow_from_directory(dir_train,
target_size = (img_size, img_size),
color_mode = "grayscale",
batch_size = batch_size,
class_mode = 'categorical',
classes = ['happy', 'sad', 'neutral', 'surprise'],
seed = 42,
shuffle = True)
val_set_grayscale = datagen_validation_grayscale.flow_from_directory(dir_validation,
target_size = (img_size, img_size),
color_mode = "grayscale",
batch_size = batch_size,
class_mode = 'categorical',
classes = ['happy', 'sad', 'neutral', 'surprise'],
seed = 42,
shuffle = False)
test_set_grayscale = datagen_test_grayscale.flow_from_directory(dir_test,
target_size = (img_size, img_size),
color_mode = "grayscale",
batch_size = batch_size,
class_mode = 'categorical',
classes = ['happy', 'sad', 'neutral', 'surprise'],
seed = 42,
shuffle = False)
# Creating train, validation, and test sets for RGB colormode
print("\nColor Images")
train_set_rgb = datagen_train_rgb.flow_from_directory(dir_train,
target_size = (img_size, img_size),
color_mode = "rgb",
batch_size = batch_size,
class_mode = 'categorical',
classes = ['happy', 'sad', 'neutral', 'surprise'],
seed = 42,
shuffle = True)
val_set_rgb = datagen_validation_rgb.flow_from_directory(dir_validation,
target_size = (img_size, img_size),
color_mode = "rgb",
batch_size = batch_size,
class_mode = 'categorical',
classes = ['happy', 'sad', 'neutral', 'surprise'],
seed = 42,
shuffle = False)
test_set_rgb = datagen_test_rgb.flow_from_directory(dir_test,
target_size = (img_size, img_size),
color_mode = "rgb",
batch_size = batch_size,
class_mode = 'categorical',
classes = ['happy', 'sad', 'neutral', 'surprise'],
seed = 42,
shuffle = False)
Grayscale Images
Found 15109 images belonging to 4 classes.
Found 4977 images belonging to 4 classes.
Found 128 images belonging to 4 classes.

Color Images
Found 15109 images belonging to 4 classes.
Found 4977 images belonging to 4 classes.
Found 128 images belonging to 4 classes.
Note:
Data augmentation performed on the data for these initial models includes horizontal_flip, brightness_range, rescale, and shear_range.
A Note About Neural Networks:
The best algorithmic tools we have available to us for processing images are neural networks. In particular, convolutional neural networks (CNN) have significant advantages over standard artificial neural networks (ANN).
While image classification utilizing ANNs is possible, there are some drawbacks: flattening an image into a one-dimensional vector discards its spatial structure, ANNs are not translation invariant (the same object shifted to a new location looks entirely new to the network), and fully connected layers require a separate weight for every input pixel, so parameter counts explode as image size grows.
On the other hand, through the use of convolutional and pooling layers, CNNs are translationally and spatially invariant. They are able to understand that the location of an object within an image is not important, nor is the background of the image itself. CNNs, through the use of their convolutional layers, are also better able to extract important features of an object within an image. Finally, CNNs take advantage of weight sharing, as the same filters are applied to each area of the image. This reduces the number of weights that need to be learned through backpropagation, thereby minimizing the number of trainable parameters and reducing computational expense.
Taking all of this into account, we will proceed with the development of CNN models to pursue our objectives.
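To put the weight-sharing advantage in numbers, the back-of-the-envelope sketch below compares the parameter count of a single fully connected layer against a single convolutional layer on a 48x48 grayscale input (the 512-unit and 64-filter sizes are chosen only to mirror layer sizes used later in this project):

```python
# Parameter counts for one layer on a 48x48x1 input, computed by hand

h, w, c = 48, 48, 1

# Fully connected: every input pixel connects to every neuron
dense_units = 512
dense_params = (h * w * c) * dense_units + dense_units  # weights + biases
print(dense_params)  # 1180160

# Convolutional: the same 3x3 filters slide over the entire image,
# so the weight count is independent of image size
filters, k = 64, 3
conv_params = (k * k * c) * filters + filters  # weights + biases
print(conv_params)  # 640
```

The convolutional layer learns over three orders of magnitude fewer weights while still covering every location in the image, which is the weight-sharing benefit described above.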
Note:
We will begin by building a simple CNN model to serve as a baseline for future models. The same model will be built with color_mode set to grayscale (with an input shape of 48,48,1) as well as color_mode set to RGB (with an input shape of 48,48,3). The models will then be compared to determine if one approach outperforms the other.
A baseline grayscale model is developed first. It consists of three convolutional blocks, each with ReLU activation, MaxPooling, and a Dropout layer, followed by a single dense layer with 512 neurons and a softmax classifier for multi-class classification. Total trainable parameters: 605,060.
# Creating a Sequential model
model_1_grayscale = Sequential()
# Convolutional Block #1
model_1_grayscale.add(Conv2D(64, (2, 2), input_shape = (48, 48, 1), activation='relu', padding = 'same'))
model_1_grayscale.add(MaxPooling2D(2, 2))
model_1_grayscale.add(Dropout(0.2))
# Convolutional Block #2
model_1_grayscale.add(Conv2D(32, (2, 2), activation='relu', padding = 'same'))
model_1_grayscale.add(MaxPooling2D(2, 2))
model_1_grayscale.add(Dropout(0.2))
# Convolutional Block #3
model_1_grayscale.add(Conv2D(32, (2, 2), activation='relu', padding = 'same'))
model_1_grayscale.add(MaxPooling2D(2, 2))
model_1_grayscale.add(Dropout(0.2))
# Flatten layer
model_1_grayscale.add(Flatten())
# Dense layer
model_1_grayscale.add(Dense(512, activation = 'relu'))
# Classifier
model_1_grayscale.add(Dense(4, activation = 'softmax'))
model_1_grayscale.summary()
Metal device set to: Apple M1 Pro
Model: "sequential"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
 conv2d (Conv2D)                (None, 48, 48, 64)        320
 max_pooling2d (MaxPooling2D)   (None, 24, 24, 64)        0
 dropout (Dropout)              (None, 24, 24, 64)        0
 conv2d_1 (Conv2D)              (None, 24, 24, 32)        8224
 max_pooling2d_1 (MaxPooling2D) (None, 12, 12, 32)        0
 dropout_1 (Dropout)            (None, 12, 12, 32)        0
 conv2d_2 (Conv2D)              (None, 12, 12, 32)        4128
 max_pooling2d_2 (MaxPooling2D) (None, 6, 6, 32)          0
 dropout_2 (Dropout)            (None, 6, 6, 32)          0
 flatten (Flatten)              (None, 1152)              0
 dense (Dense)                  (None, 512)               590336
 dense_1 (Dense)                (None, 4)                 2052
=================================================================
Total params: 605,060
Trainable params: 605,060
Non-trainable params: 0
_________________________________________________________________
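The totals in the summary above can be checked by hand: a Conv2D layer contributes (k·k·in_channels + 1)·filters parameters (the +1 being the bias), and a Dense layer (in + 1)·out. A quick sketch, pure Python with no Keras required:

```python
def conv2d_params(k, in_ch, filters):
    # (k*k*in_ch weights + 1 bias) per filter
    return (k * k * in_ch + 1) * filters

def dense_params(in_units, out_units):
    # one weight per input, plus one bias, per output neuron
    return (in_units + 1) * out_units

total = (
    conv2d_params(2, 1, 64)          # conv2d:   320
    + conv2d_params(2, 64, 32)       # conv2d_1: 8,224
    + conv2d_params(2, 32, 32)       # conv2d_2: 4,128
    + dense_params(6 * 6 * 32, 512)  # dense:    590,336 (flattened 6x6x32 = 1,152)
    + dense_params(512, 4)           # dense_1:  2,052
)
print(total)  # 605060
```

Swapping the first layer's input channels from 1 to 3 adds exactly 512 parameters, which accounts for the RGB model's 605,572 total later on.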
# Creating a checkpoint which saves model weights from the best epoch
checkpoint = ModelCheckpoint("./model_1_grayscale.h5", monitor='val_accuracy', verbose=1, save_best_only=True, mode='auto')
# Initiates early stopping if validation loss does not continue to improve
early_stopping = EarlyStopping(monitor = 'val_loss',
                               min_delta = 0,
                               patience = 5,
                               verbose = 1,
                               restore_best_weights = True)
# Initiates reduced learning rate if validation loss does not continue to improve
reduce_learningrate = ReduceLROnPlateau(monitor = 'val_loss',
                                        factor = 0.2,
                                        patience = 3,
                                        verbose = 1,
                                        min_delta = 0.0001)
callbacks_list = [checkpoint, early_stopping, reduce_learningrate]
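Because ReduceLROnPlateau's patience (3) is shorter than EarlyStopping's (5), the learning rate gets cut before training is abandoned, giving the model a chance to recover on a flatter schedule. A simplified simulation of that interplay (ignoring min_delta and cooldown, so only an approximation of Keras's actual bookkeeping):

```python
def simulate_callbacks(val_losses, lr=1e-3, factor=0.2,
                       plateau_patience=3, stop_patience=5):
    """Replay a val_loss curve and record LR cuts and the early stop."""
    best, wait_lr, wait_stop, events = float("inf"), 0, 0, []
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:
            best, wait_lr, wait_stop = loss, 0, 0  # improvement resets both counters
            continue
        wait_lr += 1
        wait_stop += 1
        if wait_lr >= plateau_patience:
            lr *= factor
            wait_lr = 0
            events.append(("reduce_lr", epoch, lr))
        if wait_stop >= stop_patience:
            events.append(("early_stop", epoch))
            break
    return events

# A plateau after epoch 2: the LR is cut at epoch 5, training stops at epoch 7.
print(simulate_callbacks([1.0, 0.9, 0.9, 0.9, 0.9, 0.9, 0.9]))
```

The same ordering is visible in the training logs below: the learning rate drops several times before early stopping finally fires.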
# Compiling model with optimizer set to Adam, loss set to categorical_crossentropy, and metrics set to accuracy
model_1_grayscale.compile(optimizer = Adam(learning_rate = 0.001), loss = 'categorical_crossentropy', metrics = ['accuracy'])
# Fitting model with epochs set to 100
history_1_grayscale = model_1_grayscale.fit(train_set_grayscale, validation_data = val_set_grayscale, epochs = 100, callbacks = callbacks_list)
[Training log condensed] Training ran for 34 of a possible 100 epochs at roughly 20 s/epoch. Validation accuracy climbed from 0.4045 (epoch 1) to 0.6868 (epoch 34), while ReduceLROnPlateau cut the learning rate from 1e-03 to 2e-04 (epoch 22), 4e-05 (epoch 26), and 8e-06 (epoch 32). EarlyStopping fired at epoch 34 and restored the weights from the best epoch, 29 (loss: 0.6823, accuracy: 0.7192, val_loss: 0.7811, val_accuracy: 0.6835).
# Plotting the accuracies
plt.figure(figsize = (10, 5))
plt.plot(history_1_grayscale.history['accuracy'])
plt.plot(history_1_grayscale.history['val_accuracy'])
plt.title('Accuracy - Model 1 (Grayscale)')
plt.ylabel('Accuracy')
plt.xlabel('Epoch')
plt.legend(['Training', 'Validation'], loc='lower right')
plt.show()
# Plotting the losses
plt.figure(figsize = (10, 5))
plt.plot(history_1_grayscale.history['loss'])
plt.plot(history_1_grayscale.history['val_loss'])
plt.title('Loss - Model 1 (Grayscale)')
plt.ylabel('Loss')
plt.xlabel('Epoch')
plt.legend(['Training', 'Validation'], loc='upper right')
plt.show()
# Evaluating the model's performance on the test set
accuracy = model_1_grayscale.evaluate(test_set_grayscale)
4/4 [==============================] - 0s 21ms/step - loss: 0.8219 - accuracy: 0.6484
Observations and Insights:
As constructed, our baseline grayscale model performs decently. At the best epoch (29), training accuracy stands at 0.72 and validation accuracy at 0.68. Training accuracy and loss continue to improve while validation accuracy and loss level off, at which point early stopping ends training. Accuracy on the test set is 0.65. A glance at the results, and at the accuracy/loss graphs above, reveals a model that is overfitting and consequently has some room for improvement.
|  | Training | Validation | Test |
|---|---|---|---|
| Grayscale Accuracy | 0.72 | 0.68 | 0.65 |
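The best-epoch figures quoted above can be pulled out of the History object programmatically rather than read off the log. A small helper sketch (pure Python; `history_1_grayscale.history` is the dict Keras returns, stood in for here by a toy dict):

```python
def best_epoch(history, monitor="val_accuracy"):
    """Return the 1-based best epoch for `monitor`, its value, and the train/val gap there."""
    values = history[monitor]
    idx = max(range(len(values)), key=values.__getitem__)
    gap = history["accuracy"][idx] - history["val_accuracy"][idx]
    return idx + 1, values[idx], gap

# Toy history standing in for history_1_grayscale.history:
toy = {"accuracy": [0.5, 0.75], "val_accuracy": [0.45, 0.5]}
print(best_epoch(toy))  # (2, 0.5, 0.25)
```

Note that EarlyStopping's restore_best_weights keys off val_loss while the checkpoint monitors val_accuracy, so pass the appropriate `monitor` for the question being asked.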
Note:
This baseline model uses the same architecture as the grayscale model above. Because the input shape changes from (48, 48, 1) (grayscale) to (48, 48, 3) (RGB), the first convolutional layer gains weights and total trainable parameters increase to 605,572.
# Creating a Sequential model
model_1_rgb = Sequential()
# Convolutional Block #1
model_1_rgb.add(Conv2D(64, (2, 2), input_shape = (48, 48, 3), activation='relu', padding = 'same'))
model_1_rgb.add(MaxPooling2D(2, 2))
model_1_rgb.add(Dropout(0.2))
# Convolutional Block #2
model_1_rgb.add(Conv2D(32, (2, 2), activation='relu', padding = 'same'))
model_1_rgb.add(MaxPooling2D(2, 2))
model_1_rgb.add(Dropout(0.2))
# Convolutional Block #3
model_1_rgb.add(Conv2D(32, (2, 2), activation='relu', padding = 'same'))
model_1_rgb.add(MaxPooling2D(2, 2))
model_1_rgb.add(Dropout(0.2))
# Flatten layer
model_1_rgb.add(Flatten())
# Dense layer
model_1_rgb.add(Dense(512, activation = 'relu'))
# Classifier
model_1_rgb.add(Dense(4, activation = 'softmax'))
model_1_rgb.summary()
Model: "sequential_1"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
 conv2d_3 (Conv2D)               (None, 48, 48, 64)       832
 max_pooling2d_3 (MaxPooling2D)  (None, 24, 24, 64)       0
 dropout_3 (Dropout)             (None, 24, 24, 64)       0
 conv2d_4 (Conv2D)               (None, 24, 24, 32)       8224
 max_pooling2d_4 (MaxPooling2D)  (None, 12, 12, 32)       0
 dropout_4 (Dropout)             (None, 12, 12, 32)       0
 conv2d_5 (Conv2D)               (None, 12, 12, 32)       4128
 max_pooling2d_5 (MaxPooling2D)  (None, 6, 6, 32)         0
 dropout_5 (Dropout)             (None, 6, 6, 32)         0
 flatten_1 (Flatten)             (None, 1152)             0
 dense_2 (Dense)                 (None, 512)              590336
 dense_3 (Dense)                 (None, 4)                2052
=================================================================
Total params: 605,572
Trainable params: 605,572
Non-trainable params: 0
_________________________________________________________________
# Creating a checkpoint which saves model weights from the best epoch
checkpoint = ModelCheckpoint("./model_1_rgb.h5", monitor='val_accuracy', verbose=1, save_best_only=True, mode='auto')
# Initiates early stopping if validation loss does not continue to improve
early_stopping = EarlyStopping(monitor = 'val_loss',
                               min_delta = 0,
                               patience = 5,
                               verbose = 1,
                               restore_best_weights = True)
# Initiates reduced learning rate if validation loss does not continue to improve
reduce_learningrate = ReduceLROnPlateau(monitor = 'val_loss',
                                        factor = 0.2,
                                        patience = 3,
                                        verbose = 1,
                                        min_delta = 0.0001)
callbacks_list = [checkpoint, early_stopping, reduce_learningrate]
# Compiling model with optimizer set to Adam, loss set to categorical_crossentropy, and metrics set to accuracy
model_1_rgb.compile(optimizer = Adam(learning_rate = 0.001), loss = 'categorical_crossentropy', metrics = ['accuracy'])
# Fitting model with epochs set to 100
history_1_rgb = model_1_rgb.fit(train_set_rgb, validation_data = val_set_rgb, epochs = 100, callbacks = callbacks_list)
[Training log condensed] Training ran for 29 of a possible 100 epochs at roughly 24 s/epoch. Validation accuracy climbed from 0.4296 (epoch 1) to 0.6850 (epoch 28), with ReduceLROnPlateau cutting the learning rate from 1e-03 to 2e-04 at epoch 27. EarlyStopping fired at epoch 29 and restored the weights from the best epoch, 24 (loss: 0.6804, accuracy: 0.7218, val_loss: 0.7789, val_accuracy: 0.6823).
# Plotting the accuracies
plt.figure(figsize = (10, 5))
plt.plot(history_1_rgb.history['accuracy'])
plt.plot(history_1_rgb.history['val_accuracy'])
plt.title('Accuracy - Model 1 (RGB)')
plt.ylabel('Accuracy')
plt.xlabel('Epoch')
plt.legend(['Training', 'Validation'], loc='lower right')
plt.show()
# Plotting the losses
plt.figure(figsize = (10, 5))
plt.plot(history_1_rgb.history['loss'])
plt.plot(history_1_rgb.history['val_loss'])
plt.title('Loss - Model 1 (RGB)')
plt.ylabel('Loss')
plt.xlabel('Epoch')
plt.legend(['Training', 'Validation'], loc='upper right')
plt.show()
# Evaluating the model's performance on the test set
accuracy = model_1_rgb.evaluate(test_set_rgb)
4/4 [==============================] - 0s 29ms/step - loss: 0.8046 - accuracy: 0.6328
Observations and Insights:
As constructed, our baseline RGB model also performs decently. At the best epoch (24), training accuracy stands at 0.72 and validation accuracy at 0.68. Training accuracy and loss continue to improve while validation accuracy and loss level off, at which point early stopping ends training. Accuracy on the test set is 0.63.
Our baseline grayscale and RGB models perform similarly across all metrics. Both underfit the data for the first 10-15 epochs, likely because of the Dropout layers in the architecture, and then begin to overfit. Perhaps a slight edge goes to the grayscale model, which performs better on the test set with fewer trainable parameters, making it computationally less expensive when scaled.
|  | Training | Validation | Test |
|---|---|---|---|
| Grayscale Accuracy | 0.72 | 0.68 | 0.65 |
| RGB Accuracy | 0.72 | 0.68 | 0.63 |
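The "computationally less expensive" point can be made concrete. Only the first Conv2D layer sees the extra input channels, and a rough multiply-accumulate (MAC) count for that layer, ignoring bias additions, shows the RGB input tripling its work while every later layer is unchanged:

```python
def conv2d_macs(out_h, out_w, k, in_ch, filters):
    # one MAC per kernel weight, per output position, per filter
    return out_h * out_w * k * k * in_ch * filters

gray = conv2d_macs(48, 48, 2, 1, 64)  # first conv layer, grayscale input
rgb  = conv2d_macs(48, 48, 2, 3, 64)  # first conv layer, RGB input
print(rgb / gray)  # 3.0
```

Deeper layers dominate total cost, so the end-to-end difference is modest; the stronger argument for grayscale here is simply that test accuracy did not benefit from color.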
# Plotting the accuracies
plt.figure(figsize = (10, 5))
plt.plot(history_1_grayscale.history['accuracy'])
plt.plot(history_1_grayscale.history['val_accuracy'])
plt.plot(history_1_rgb.history['accuracy'])
plt.plot(history_1_rgb.history['val_accuracy'])
plt.title('Accuracy - Model 1 (Grayscale & RGB)')
plt.ylabel('Accuracy')
plt.xlabel('Epoch')
plt.legend(['Training Accuracy (Grayscale)', 'Validation Accuracy (Grayscale)',
'Training Accuracy (RGB)', 'Validation Accuracy (RGB)'], loc='lower right')
plt.show()
Note:
We will now build a slightly deeper model to see if we can improve performance. As with the baseline, we will train it with both grayscale and RGB color modes so we can compare performance.
Our second model comprises four convolutional blocks (each a Conv2D layer with ReLU activation, followed by BatchNormalization, a LeakyReLU layer, and MaxPooling), a dense layer with 512 neurons, another dense layer with 256 neurons, and a softmax classifier. The grayscale version has 455,780 total parameters (454,820 trainable).
# Creating a Sequential model
model_2_grayscale = Sequential()
# Convolutional Block #1
model_2_grayscale.add(Conv2D(256, (2, 2), input_shape = (48, 48, 1), activation='relu', padding = 'same'))
model_2_grayscale.add(BatchNormalization())
model_2_grayscale.add(LeakyReLU(alpha = 0.1))
model_2_grayscale.add(MaxPooling2D(2, 2))
# Convolutional Block #2
model_2_grayscale.add(Conv2D(128, (2, 2), activation='relu', padding = 'same'))
model_2_grayscale.add(BatchNormalization())
model_2_grayscale.add(LeakyReLU(alpha = 0.1))
model_2_grayscale.add(MaxPooling2D(2, 2))
# Convolutional Block #3
model_2_grayscale.add(Conv2D(64, (2, 2), activation='relu', padding = 'same'))
model_2_grayscale.add(BatchNormalization())
model_2_grayscale.add(LeakyReLU(alpha = 0.1))
model_2_grayscale.add(MaxPooling2D(2, 2))
# Convolutional Block #4
model_2_grayscale.add(Conv2D(32, (2, 2), activation='relu', padding = 'same'))
model_2_grayscale.add(BatchNormalization())
model_2_grayscale.add(LeakyReLU(alpha = 0.1))
model_2_grayscale.add(MaxPooling2D(2, 2))
# Flatten layer
model_2_grayscale.add(Flatten())
# Dense layers
model_2_grayscale.add(Dense(512, activation = 'relu'))
model_2_grayscale.add(Dense(256, activation = 'relu'))
# Classifier
model_2_grayscale.add(Dense(4, activation = 'softmax'))
model_2_grayscale.summary()
Model: "sequential_2"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
 conv2d_6 (Conv2D)                          (None, 48, 48, 256)  1280
 batch_normalization (BatchNormalization)   (None, 48, 48, 256)  1024
 leaky_re_lu (LeakyReLU)                    (None, 48, 48, 256)  0
 max_pooling2d_6 (MaxPooling2D)             (None, 24, 24, 256)  0
 conv2d_7 (Conv2D)                          (None, 24, 24, 128)  131200
 batch_normalization_1 (BatchNormalization) (None, 24, 24, 128)  512
 leaky_re_lu_1 (LeakyReLU)                  (None, 24, 24, 128)  0
 max_pooling2d_7 (MaxPooling2D)             (None, 12, 12, 128)  0
 conv2d_8 (Conv2D)                          (None, 12, 12, 64)   32832
 batch_normalization_2 (BatchNormalization) (None, 12, 12, 64)   256
 leaky_re_lu_2 (LeakyReLU)                  (None, 12, 12, 64)   0
 max_pooling2d_8 (MaxPooling2D)             (None, 6, 6, 64)     0
 conv2d_9 (Conv2D)                          (None, 6, 6, 32)     8224
 batch_normalization_3 (BatchNormalization) (None, 6, 6, 32)     128
 leaky_re_lu_3 (LeakyReLU)                  (None, 6, 6, 32)     0
 max_pooling2d_9 (MaxPooling2D)             (None, 3, 3, 32)     0
 flatten_2 (Flatten)                        (None, 288)          0
 dense_4 (Dense)                            (None, 512)          147968
 dense_5 (Dense)                            (None, 256)          131328
 dense_6 (Dense)                            (None, 4)            1028
=================================================================
Total params: 455,780
Trainable params: 454,820
Non-trainable params: 960
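The 960 non-trainable parameters come entirely from BatchNormalization: each BN layer carries four parameters per channel, a learned scale (gamma) and shift (beta), which are trainable, plus the moving mean and variance, which are not. A quick check against the summary above:

```python
def bn_params(channels):
    # (gamma, beta) trainable; (moving_mean, moving_variance) non-trainable
    return 2 * channels, 2 * channels

trainable_bn = non_trainable = 0
for ch in (256, 128, 64, 32):  # the four BatchNormalization layers above
    t, n = bn_params(ch)
    trainable_bn += t
    non_trainable += n

print(non_trainable)  # 960
```

Together with the Conv2D and Dense counts this reproduces the 455,780 total, of which 454,820 are trainable.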
_________________________________________________________________
# Creating a checkpoint which saves model weights from the best epoch
checkpoint = ModelCheckpoint("./model_2_grayscale.h5", monitor='val_accuracy', verbose=1, save_best_only=True, mode='auto')
# Initiates early stopping if validation loss does not continue to improve
early_stopping = EarlyStopping(monitor = 'val_loss',
                               min_delta = 0,
                               patience = 5,
                               verbose = 1,
                               restore_best_weights = True)
# Initiates reduced learning rate if validation loss does not continue to improve
reduce_learningrate = ReduceLROnPlateau(monitor = 'val_loss',
                                        factor = 0.2,
                                        patience = 3,
                                        verbose = 1,
                                        min_delta = 0.0001)
callbacks_list = [checkpoint, early_stopping, reduce_learningrate]
# Compiling model with optimizer set to Adam, loss set to categorical_crossentropy, and metrics set to accuracy
model_2_grayscale.compile(optimizer = Adam(learning_rate = 0.001), loss = 'categorical_crossentropy', metrics = ['accuracy'])
# Fitting model with epochs set to 100
history_2_grayscale = model_2_grayscale.fit(train_set_grayscale, validation_data = val_set_grayscale, epochs = 100, callbacks = callbacks_list)
Epoch 1/100 - loss: 1.2684 - accuracy: 0.4053 - val_loss: 1.2783 - val_accuracy: 0.4213 - lr: 0.0010 (val_accuracy improved to 0.42134, model saved)
Epoch 2/100 - loss: 1.0451 - accuracy: 0.5404 - val_loss: 0.9890 - val_accuracy: 0.5837 - lr: 0.0010 (val_accuracy improved to 0.58368, model saved)
Epoch 3/100 - loss: 0.9524 - accuracy: 0.5900 - val_loss: 0.9803 - val_accuracy: 0.5694 - lr: 0.0010
Epoch 4/100 - loss: 0.8941 - accuracy: 0.6146 - val_loss: 0.9132 - val_accuracy: 0.6106 - lr: 0.0010 (val_accuracy improved to 0.61061, model saved)
Epoch 5/100 - loss: 0.8503 - accuracy: 0.6401 - val_loss: 0.9677 - val_accuracy: 0.5929 - lr: 0.0010
Epoch 6/100 - loss: 0.8184 - accuracy: 0.6495 - val_loss: 0.8306 - val_accuracy: 0.6586 - lr: 0.0010 (val_accuracy improved to 0.65863, model saved)
Epoch 7/100 - loss: 0.7853 - accuracy: 0.6723 - val_loss: 0.8979 - val_accuracy: 0.6225 - lr: 0.0010
Epoch 8/100 - loss: 0.7632 - accuracy: 0.6789 - val_loss: 0.9091 - val_accuracy: 0.6205 - lr: 0.0010
Epoch 9/100 - loss: 0.7443 - accuracy: 0.6879 - val_loss: 0.8299 - val_accuracy: 0.6540 - lr: 0.0010
Epoch 10/100 - loss: 0.7266 - accuracy: 0.6940 - val_loss: 0.7992 - val_accuracy: 0.6723 - lr: 0.0010 (val_accuracy improved to 0.67229, model saved)
Epoch 11/100 - loss: 0.7089 - accuracy: 0.7046 - val_loss: 0.8848 - val_accuracy: 0.6418 - lr: 0.0010
Epoch 12/100 - loss: 0.6910 - accuracy: 0.7096 - val_loss: 0.7942 - val_accuracy: 0.6819 - lr: 0.0010 (val_accuracy improved to 0.68194, model saved)
Epoch 13/100 - loss: 0.6705 - accuracy: 0.7235 - val_loss: 0.8160 - val_accuracy: 0.6815 - lr: 0.0010
Epoch 14/100 - loss: 0.6581 - accuracy: 0.7286 - val_loss: 0.7747 - val_accuracy: 0.6886 - lr: 0.0010 (val_accuracy improved to 0.68857, model saved)
Epoch 15/100 - loss: 0.6475 - accuracy: 0.7255 - val_loss: 0.7918 - val_accuracy: 0.6769 - lr: 0.0010
Epoch 16/100 - loss: 0.6280 - accuracy: 0.7408 - val_loss: 0.7942 - val_accuracy: 0.6701 - lr: 0.0010
Epoch 17/100 - loss: 0.6140 - accuracy: 0.7495 - val_loss: 0.7844 - val_accuracy: 0.6874 - lr: 0.0010 (ReduceLROnPlateau: lr reduced to 2.0000e-04)
Epoch 18/100 - loss: 0.5395 - accuracy: 0.7787 - val_loss: 0.7416 - val_accuracy: 0.7147 - lr: 2.0000e-04 (val_accuracy improved to 0.71469, model saved)
Epoch 19/100 - loss: 0.5027 - accuracy: 0.7936 - val_loss: 0.7523 - val_accuracy: 0.7201 - lr: 2.0000e-04 (val_accuracy improved to 0.72011, model saved)
Epoch 20/100 - loss: 0.4969 - accuracy: 0.7956 - val_loss: 0.7552 - val_accuracy: 0.7141 - lr: 2.0000e-04
Epoch 21/100 - loss: 0.4704 - accuracy: 0.8069 - val_loss: 0.7567 - val_accuracy: 0.7195 - lr: 2.0000e-04 (ReduceLROnPlateau: lr reduced to 4.0000e-05)
Epoch 22/100 - loss: 0.4488 - accuracy: 0.8198 - val_loss: 0.7835 - val_accuracy: 0.7125 - lr: 4.0000e-05
Epoch 23/100 - loss: 0.4423 - accuracy: 0.8201 - val_loss: 0.7753 - val_accuracy: 0.7195 - lr: 4.0000e-05 (early stopping; restoring model weights from the best epoch: 18)
# Plotting the accuracies
plt.figure(figsize = (10, 5))
plt.plot(history_2_grayscale.history['accuracy'])
plt.plot(history_2_grayscale.history['val_accuracy'])
plt.title('Accuracy - Model 2 (Grayscale)')
plt.ylabel('Accuracy')
plt.xlabel('Epoch')
plt.legend(['Training', 'Validation'], loc='lower right')
plt.show()
# Plotting the losses
plt.figure(figsize = (10, 5))
plt.plot(history_2_grayscale.history['loss'])
plt.plot(history_2_grayscale.history['val_loss'])
plt.title('Loss - Model 2 (Grayscale)')
plt.ylabel('Loss')
plt.xlabel('Epoch')
plt.legend(['Training', 'Validation'], loc='upper right')
plt.show()
# Evaluating the model's performance on the test set
accuracy = model_2_grayscale.evaluate(test_set_grayscale)
4/4 [==============================] - 0s 29ms/step - loss: 0.8122 - accuracy: 0.6875
Observations and Insights:
As constructed, our second, deeper grayscale model performs somewhat differently from its predecessor. After 18 epochs (the best epoch), training accuracy stands at 0.78 and validation accuracy at 0.71, both higher than Model 1. However, Model 2 begins to overfit almost immediately, and the gap between training and validation scores only grows from there: training accuracy and loss continue to improve while validation accuracy and loss level off before early stopping ends training. Accuracy on the test set is 0.69. The model is not generalizing well, but given its better accuracy scores compared to Model 1, it has the potential (if overfitting can be reduced) to become the better grayscale model.
| | Training | Validation | Test |
|---|---|---|---|
| Grayscale Accuracy | 0.78 | 0.71 | 0.69 |
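If overfitting is to be reduced in a later iteration, regularization such as Dropout is a common option (an addition we have not made in this model). Dropout zeroes a random fraction of activations during training and rescales the survivors so the expected activation is unchanged; a minimal NumPy sketch of the mechanism, with illustrative values only:

```python
import numpy as np

def dropout(x, rate, rng, training=True):
    """Inverted dropout: zero a fraction `rate` of units and rescale
    the survivors by 1/(1-rate) so the expected activation is unchanged."""
    if not training:
        return x
    keep = (rng.random(x.shape) >= rate).astype(x.dtype)
    return x * keep / (1.0 - rate)

rng = np.random.default_rng(42)
x = np.ones(1000)
y = dropout(x, rate=0.2, rng=rng)

dropped = np.sum(y == 0) / y.size   # fraction zeroed, close to 0.2
survivor = y[y != 0][0]             # survivors scaled from 1.0 to 1/0.8 = 1.25
```

In Keras, this would amount to inserting Dropout(rate) layers between the dense layers, as is done in the transfer learning head later in this notebook.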
Note:
This model uses the same architecture as the grayscale model above. Because the input shape changes from (48, 48, 1) (grayscale) to (48, 48, 3) (RGB), the first convolutional layer carries more weights, and total parameters increase to 457,828.
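The entire increase comes from the first convolutional layer, whose weight count depends on the number of input channels. A quick sanity check of the arithmetic:

```python
# Conv2D parameter count: (kernel_h * kernel_w * in_channels + 1 bias) * filters
def conv2d_params(kernel_size, in_channels, filters):
    return (kernel_size * kernel_size * in_channels + 1) * filters

gray_first = conv2d_params(2, 1, 256)  # first conv layer on a (48, 48, 1) input
rgb_first = conv2d_params(2, 3, 256)   # first conv layer on a (48, 48, 3) input

print(gray_first, rgb_first)               # 1280 3328
print(455780 + (rgb_first - gray_first))   # 457828
```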
# Creating a Sequential model
model_2_rgb = Sequential()
# Convolutional Block #1
model_2_rgb.add(Conv2D(256, (2, 2), input_shape = (48, 48, 3), activation='relu', padding = 'same'))
model_2_rgb.add(BatchNormalization())
model_2_rgb.add(LeakyReLU(alpha = 0.1))
model_2_rgb.add(MaxPooling2D(2, 2))
# Convolutional Block #2
model_2_rgb.add(Conv2D(128, (2, 2), activation='relu', padding = 'same'))
model_2_rgb.add(BatchNormalization())
model_2_rgb.add(LeakyReLU(alpha = 0.1))
model_2_rgb.add(MaxPooling2D(2, 2))
# Convolutional Block #3
model_2_rgb.add(Conv2D(64, (2, 2), activation='relu', padding = 'same'))
model_2_rgb.add(BatchNormalization())
model_2_rgb.add(LeakyReLU(alpha = 0.1))
model_2_rgb.add(MaxPooling2D(2, 2))
# Convolutional Block #4
model_2_rgb.add(Conv2D(32, (2, 2), activation='relu', padding = 'same'))
model_2_rgb.add(BatchNormalization())
model_2_rgb.add(LeakyReLU(alpha = 0.1))
model_2_rgb.add(MaxPooling2D(2, 2))
# Flatten layer
model_2_rgb.add(Flatten())
# Dense layers
model_2_rgb.add(Dense(512, activation = 'relu'))
model_2_rgb.add(Dense(256, activation = 'relu'))
# Classifier
model_2_rgb.add(Dense(4, activation = 'softmax'))
model_2_rgb.summary()
Model: "sequential_3"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
conv2d_10 (Conv2D) (None, 48, 48, 256) 3328
batch_normalization_4 (BatchNormalization) (None, 48, 48, 256) 1024
leaky_re_lu_4 (LeakyReLU) (None, 48, 48, 256) 0
max_pooling2d_10 (MaxPooling2D) (None, 24, 24, 256) 0
conv2d_11 (Conv2D) (None, 24, 24, 128) 131200
batch_normalization_5 (BatchNormalization) (None, 24, 24, 128) 512
leaky_re_lu_5 (LeakyReLU) (None, 24, 24, 128) 0
max_pooling2d_11 (MaxPooling2D) (None, 12, 12, 128) 0
conv2d_12 (Conv2D) (None, 12, 12, 64) 32832
batch_normalization_6 (BatchNormalization) (None, 12, 12, 64) 256
leaky_re_lu_6 (LeakyReLU) (None, 12, 12, 64) 0
max_pooling2d_12 (MaxPooling2D) (None, 6, 6, 64) 0
conv2d_13 (Conv2D) (None, 6, 6, 32) 8224
batch_normalization_7 (BatchNormalization) (None, 6, 6, 32) 128
leaky_re_lu_7 (LeakyReLU) (None, 6, 6, 32) 0
max_pooling2d_13 (MaxPooling2D) (None, 3, 3, 32) 0
flatten_3 (Flatten) (None, 288) 0
dense_7 (Dense) (None, 512) 147968
dense_8 (Dense) (None, 256) 131328
dense_9 (Dense) (None, 4) 1028
=================================================================
Total params: 457,828
Trainable params: 456,868
Non-trainable params: 960
_________________________________________________________________
# Creating a checkpoint which saves model weights from the best epoch
checkpoint = ModelCheckpoint("./model_2_rgb.h5", monitor='val_accuracy', verbose=1, save_best_only=True, mode='auto')
# Initiates early stopping if validation loss does not continue to improve
early_stopping = EarlyStopping(monitor = 'val_loss',
                               min_delta = 0,
                               patience = 5,
                               verbose = 1,
                               restore_best_weights = True)
# Initiates reduced learning rate if validation loss does not continue to improve
reduce_learningrate = ReduceLROnPlateau(monitor = 'val_loss',
                                        factor = 0.2,
                                        patience = 3,
                                        verbose = 1,
                                        min_delta = 0.0001)
callbacks_list = [checkpoint, early_stopping, reduce_learningrate]
# Compiling model with optimizer set to Adam, loss set to categorical_crossentropy, and metrics set to accuracy
model_2_rgb.compile(optimizer = Adam(learning_rate = 0.001), loss = 'categorical_crossentropy', metrics = ['accuracy'])
# Fitting model with epochs set to 100
history_2_rgb = model_2_rgb.fit(train_set_rgb, validation_data = val_set_rgb, epochs = 100, callbacks = callbacks_list)
Epoch 1/100 - loss: 1.2630 - accuracy: 0.4033 - val_loss: 1.2748 - val_accuracy: 0.3914 - lr: 0.0010 (val_accuracy improved to 0.39140, model saved)
Epoch 2/100 - loss: 1.0839 - accuracy: 0.5184 - val_loss: 1.1498 - val_accuracy: 0.4802 - lr: 0.0010 (val_accuracy improved to 0.48021, model saved)
Epoch 3/100 - loss: 0.9816 - accuracy: 0.5715 - val_loss: 0.9470 - val_accuracy: 0.5994 - lr: 0.0010 (val_accuracy improved to 0.59936, model saved)
Epoch 4/100 - loss: 0.9106 - accuracy: 0.6063 - val_loss: 0.9275 - val_accuracy: 0.6100 - lr: 0.0010 (val_accuracy improved to 0.61001, model saved)
Epoch 5/100 - loss: 0.8590 - accuracy: 0.6297 - val_loss: 0.9990 - val_accuracy: 0.5604 - lr: 0.0010
Epoch 6/100 - loss: 0.8193 - accuracy: 0.6492 - val_loss: 0.9861 - val_accuracy: 0.5895 - lr: 0.0010
Epoch 7/100 - loss: 0.7907 - accuracy: 0.6661 - val_loss: 0.7824 - val_accuracy: 0.6713 - lr: 0.0010 (val_accuracy improved to 0.67129, model saved)
Epoch 8/100 - loss: 0.7698 - accuracy: 0.6754 - val_loss: 0.8390 - val_accuracy: 0.6562 - lr: 0.0010
Epoch 9/100 - loss: 0.7466 - accuracy: 0.6871 - val_loss: 0.8137 - val_accuracy: 0.6624 - lr: 0.0010
Epoch 10/100 - loss: 0.7287 - accuracy: 0.6936 - val_loss: 0.7865 - val_accuracy: 0.6793 - lr: 0.0010 (val_accuracy improved to 0.67932, model saved; ReduceLROnPlateau: lr reduced to 2.0000e-04)
Epoch 11/100 - loss: 0.6521 - accuracy: 0.7298 - val_loss: 0.7338 - val_accuracy: 0.7038 - lr: 2.0000e-04 (val_accuracy improved to 0.70384, model saved)
Epoch 12/100 - loss: 0.6267 - accuracy: 0.7419 - val_loss: 0.7317 - val_accuracy: 0.7119 - lr: 2.0000e-04 (val_accuracy improved to 0.71187, model saved)
Epoch 13/100 - loss: 0.6081 - accuracy: 0.7499 - val_loss: 0.7514 - val_accuracy: 0.7042 - lr: 2.0000e-04
Epoch 14/100 - loss: 0.6010 - accuracy: 0.7519 - val_loss: 0.7564 - val_accuracy: 0.7014 - lr: 2.0000e-04
Epoch 15/100 - loss: 0.5884 - accuracy: 0.7584 - val_loss: 0.7226 - val_accuracy: 0.7131 - lr: 2.0000e-04 (val_accuracy improved to 0.71308, model saved)
Epoch 16/100 - loss: 0.5740 - accuracy: 0.7657 - val_loss: 0.7616 - val_accuracy: 0.6988 - lr: 2.0000e-04
Epoch 17/100 - loss: 0.5579 - accuracy: 0.7734 - val_loss: 0.7491 - val_accuracy: 0.7165 - lr: 2.0000e-04 (val_accuracy improved to 0.71650, model saved)
Epoch 18/100 - loss: 0.5524 - accuracy: 0.7742 - val_loss: 0.7648 - val_accuracy: 0.7054 - lr: 2.0000e-04 (ReduceLROnPlateau: lr reduced to 4.0000e-05)
Epoch 19/100 - loss: 0.5191 - accuracy: 0.7883 - val_loss: 0.7552 - val_accuracy: 0.7115 - lr: 4.0000e-05
Epoch 20/100 - loss: 0.5135 - accuracy: 0.7895 - val_loss: 0.7533 - val_accuracy: 0.7101 - lr: 4.0000e-05 (early stopping; restoring model weights from the best epoch: 15)
# Plotting the accuracies
plt.figure(figsize = (10, 5))
plt.plot(history_2_rgb.history['accuracy'])
plt.plot(history_2_rgb.history['val_accuracy'])
plt.title('Accuracy - Model 2 (RGB)')
plt.ylabel('Accuracy')
plt.xlabel('Epoch')
plt.legend(['Training', 'Validation'], loc='lower right')
plt.show()
# Plotting the losses
plt.figure(figsize = (10, 5))
plt.plot(history_2_rgb.history['loss'])
plt.plot(history_2_rgb.history['val_loss'])
plt.title('Loss - Model 2 (RGB)')
plt.ylabel('Loss')
plt.xlabel('Epoch')
plt.legend(['Training', 'Validation'], loc='upper right')
plt.show()
# Evaluating the model's performance on the test set
accuracy = model_2_rgb.evaluate(test_set_rgb)
4/4 [==============================] - 0s 41ms/step - loss: 0.6983 - accuracy: 0.6797
Observations and Insights:
As constructed, our second RGB model also performs somewhat differently from its predecessor. After 15 epochs (the best epoch), training accuracy stands at 0.76 and validation accuracy at 0.71, both higher than Model 1, but Model 2 begins to overfit almost immediately. Training accuracy and loss continue to improve, while validation accuracy and loss level off before early stopping ends training. Accuracy on the test set is 0.68. Once again, the model is not generalizing well, but given its better accuracy scores compared to Model 1, it has the potential (if overfitting can be reduced) to become the better RGB model.
Our deeper grayscale and RGB models again perform similarly across all metrics. The grayscale model retains a slight edge, performing better on the test set with fewer trainable parameters.
| | Training | Validation | Test |
|---|---|---|---|
| Grayscale Accuracy | 0.78 | 0.71 | 0.69 |
| RGB Accuracy | 0.76 | 0.71 | 0.68 |
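Aggregate accuracy hides which expressions the models actually confuse; a per-class breakdown via confusion matrix would sharpen these comparisons. A minimal NumPy sketch with hypothetical labels (in practice, y_true would come from the test generator's classes attribute and y_pred from np.argmax(model.predict(...), axis=1)):

```python
import numpy as np

def confusion_matrix(y_true, y_pred, n_classes=4):
    """Rows are true classes, columns are predicted classes."""
    cm = np.zeros((n_classes, n_classes), dtype=int)
    for t, p in zip(y_true, y_pred):
        cm[t, p] += 1
    return cm

# Hypothetical labels for illustration only; class order matches the
# generators: 0=happy, 1=sad, 2=neutral, 3=surprise
y_true = np.array([0, 0, 1, 1, 2, 2, 3, 3])
y_pred = np.array([0, 0, 1, 2, 2, 1, 3, 3])

cm = confusion_matrix(y_true, y_pred)
per_class_acc = cm.diagonal() / cm.sum(axis=1)  # recall per expression
```

Here 'sad' and 'neutral' are confused with each other while 'happy' and 'surprise' are perfectly separated, which is the kind of pattern worth checking on the real test set.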
# Plotting the accuracies
plt.figure(figsize = (10, 5))
plt.plot(history_2_grayscale.history['accuracy'])
plt.plot(history_2_grayscale.history['val_accuracy'])
plt.plot(history_2_rgb.history['accuracy'])
plt.plot(history_2_rgb.history['val_accuracy'])
plt.title('Accuracy - Model 2 (Grayscale & RGB)')
plt.ylabel('Accuracy')
plt.xlabel('Epoch')
plt.legend(['Training Accuracy (Grayscale)', 'Validation Accuracy (Grayscale)',
'Training Accuracy (RGB)', 'Validation Accuracy (RGB)'], loc='lower right')
plt.show()
Overall Observations and Insights on Initial Models:
# Plotting the accuracies
plt.figure(figsize = (10, 5))
plt.plot(history_1_grayscale.history['accuracy'])
plt.plot(history_1_grayscale.history['val_accuracy'])
plt.plot(history_1_rgb.history['accuracy'])
plt.plot(history_1_rgb.history['val_accuracy'])
plt.plot(history_2_grayscale.history['accuracy'])
plt.plot(history_2_grayscale.history['val_accuracy'])
plt.plot(history_2_rgb.history['accuracy'])
plt.plot(history_2_rgb.history['val_accuracy'])
plt.title('Accuracy - Models 1 & 2 (Grayscale & RGB)' )
plt.ylabel('Accuracy')
plt.xlabel('Epoch')
plt.legend(['Training Accuracy - Model 1 (Grayscale)',
'Validation Accuracy - Model 1 (Grayscale)',
'Training Accuracy - Model 1 (RGB)',
'Validation Accuracy - Model 1 (RGB)',
'Training Accuracy - Model 2 (Grayscale)',
'Validation Accuracy - Model 2 (Grayscale)',
'Training Accuracy - Model 2 (RGB)',
'Validation Accuracy - Model 2 (RGB)'], loc='lower right')
plt.show()
In this section, we will create several transfer learning architectures. For the pre-trained models, we will select three popular architectures: VGG16, ResNet v2, and EfficientNet. Unlike the earlier models, which could also work on grayscale images, these architectures require 3 input channels (RGB).
We will create new data loaders for the transfer learning architectures used below. As required by the architectures we are building on, color_mode will be set to RGB.
Additionally, we will use the same data augmentation methods applied to our previous models so that performance can be fairly compared against our baselines. These methods include horizontal_flip, brightness_range, rescale, and shear_range.
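Note that our source images are grayscale, so loading them in RGB color mode simply replicates the single channel three times; no new color information is created. The idea in NumPy (illustrative only, not part of the pipeline):

```python
import numpy as np

gray = np.random.default_rng(0).random((48, 48, 1))  # a stand-in 48x48 grayscale image
rgb = np.repeat(gray, 3, axis=-1)                    # stack the channel 3 times

print(rgb.shape)                               # (48, 48, 3)
print(bool(np.allclose(rgb[..., 0], rgb[..., 2])))   # True: channels are identical
```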
batch_size = 32
# Creating ImageDataGenerator objects for RGB colormode
datagen_train_rgb = ImageDataGenerator(horizontal_flip = True,
                                       brightness_range = (0., 2.),
                                       rescale = 1./255,
                                       shear_range = 0.3)
datagen_validation_rgb = ImageDataGenerator(horizontal_flip = True,
                                            brightness_range = (0., 2.),
                                            rescale = 1./255,
                                            shear_range = 0.3)
datagen_test_rgb = ImageDataGenerator(horizontal_flip = True,
                                      brightness_range = (0., 2.),
                                      rescale = 1./255,
                                      shear_range = 0.3)
# Creating train, validation, and test sets for RGB colormode
print("\nColor Images")
train_set_rgb = datagen_train_rgb.flow_from_directory(dir_train,
                                                      target_size = (img_size, img_size),
                                                      color_mode = "rgb",
                                                      batch_size = batch_size,
                                                      class_mode = 'categorical',
                                                      classes = ['happy', 'sad', 'neutral', 'surprise'],
                                                      seed = 42,
                                                      shuffle = True)
val_set_rgb = datagen_validation_rgb.flow_from_directory(dir_validation,
                                                         target_size = (img_size, img_size),
                                                         color_mode = "rgb",
                                                         batch_size = batch_size,
                                                         class_mode = 'categorical',
                                                         classes = ['happy', 'sad', 'neutral', 'surprise'],
                                                         seed = 42,
                                                         shuffle = False)
test_set_rgb = datagen_test_rgb.flow_from_directory(dir_test,
                                                    target_size = (img_size, img_size),
                                                    color_mode = "rgb",
                                                    batch_size = batch_size,
                                                    class_mode = 'categorical',
                                                    classes = ['happy', 'sad', 'neutral', 'surprise'],
                                                    seed = 42,
                                                    shuffle = False)
Color Images
Found 15109 images belonging to 4 classes.
Found 4977 images belonging to 4 classes.
Found 128 images belonging to 4 classes.
First up is the VGG16 model, a CNN consisting of 13 convolutional layers, 5 MaxPooling layers, and 3 dense layers. VGG16 achieves nearly 93% top-5 accuracy on the ImageNet dataset, which contains 14 million images across 1,000 classes. Clearly, this is a much more substantial architecture than our models above.
vgg = VGG16(include_top = False, weights = 'imagenet', input_shape = (48, 48, 3))
vgg.summary()
Metal device set to: Apple M1 Pro
Model: "vgg16"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
input_1 (InputLayer) [(None, 48, 48, 3)] 0
block1_conv1 (Conv2D) (None, 48, 48, 64) 1792
block1_conv2 (Conv2D) (None, 48, 48, 64) 36928
block1_pool (MaxPooling2D) (None, 24, 24, 64) 0
block2_conv1 (Conv2D) (None, 24, 24, 128) 73856
block2_conv2 (Conv2D) (None, 24, 24, 128) 147584
block2_pool (MaxPooling2D) (None, 12, 12, 128) 0
block3_conv1 (Conv2D) (None, 12, 12, 256) 295168
block3_conv2 (Conv2D) (None, 12, 12, 256) 590080
block3_conv3 (Conv2D) (None, 12, 12, 256) 590080
block3_pool (MaxPooling2D) (None, 6, 6, 256) 0
block4_conv1 (Conv2D) (None, 6, 6, 512) 1180160
block4_conv2 (Conv2D) (None, 6, 6, 512) 2359808
block4_conv3 (Conv2D) (None, 6, 6, 512) 2359808
block4_pool (MaxPooling2D) (None, 3, 3, 512) 0
block5_conv1 (Conv2D) (None, 3, 3, 512) 2359808
block5_conv2 (Conv2D) (None, 3, 3, 512) 2359808
block5_conv3 (Conv2D) (None, 3, 3, 512) 2359808
block5_pool (MaxPooling2D) (None, 1, 1, 512) 0
=================================================================
Total params: 14,714,688
Trainable params: 14,714,688
Non-trainable params: 0
_________________________________________________________________
We have imported the VGG16 model and will use it only up to the 'block4_pool' layer, as this cut-off point has shown the best performance compared to other layers (discussed below). The VGG16 layers will be frozen, so the only trainable layers will be those we add ourselves. After flattening the output of 'block4_pool', we will add 2 dense layers, followed by a Dropout layer, another dense layer, and BatchNormalization, ending with a softmax classifier.
transfer_layer = vgg.get_layer('block4_pool')
vgg.trainable = False
# Flatten the input
x = Flatten()(transfer_layer.output)
# Dense layers
x = Dense(256, activation='relu')(x)
x = Dense(128, activation='relu')(x)
x = Dropout(0.2)(x)
x = Dense(64, activation='relu')(x)
x = BatchNormalization()(x)
# Classifier
pred = Dense(4, activation='softmax')(x)
# Initialize the model
model_3 = Model(vgg.input, pred)
# Creating a checkpoint which saves model weights from the best epoch
checkpoint = ModelCheckpoint('./model_3.h5', monitor='val_accuracy', verbose=1, save_best_only=True, mode='auto')
# Initiates early stopping if validation loss does not continue to improve
early_stopping = EarlyStopping(monitor = 'val_loss',
                               min_delta = 0,
                               patience = 15, # Increased compared to the initial models; otherwise training is cut off too quickly
                               verbose = 1,
                               restore_best_weights = True)
# Initiates reduced learning rate if validation loss does not continue to improve
reduce_learningrate = ReduceLROnPlateau(monitor = 'val_loss',
                                        factor = 0.2,
                                        patience = 3,
                                        verbose = 1,
                                        min_delta = 0.0001)
callbacks_list = [checkpoint, early_stopping, reduce_learningrate]
# Compiling model with optimizer set to Adam, loss set to categorical_crossentropy, and metrics set to accuracy
model_3.compile(optimizer = Adam(learning_rate = 0.001), loss = 'categorical_crossentropy', metrics = ['accuracy'])
# Fitting model with epochs set to 100
history_3 = model_3.fit(train_set_rgb, validation_data = val_set_rgb, epochs = 100, callbacks = callbacks_list)
Epoch 1/100 - loss: 1.1565 - accuracy: 0.4859 - val_loss: 0.9624 - val_accuracy: 0.5837 - lr: 0.0010 (val_accuracy improved to 0.58368, model saved)
Epoch 2/100 - loss: 0.9956 - accuracy: 0.5695 - val_loss: 0.9489 - val_accuracy: 0.6048 - lr: 0.0010 (val_accuracy improved to 0.60478, model saved)
Epoch 3/100 - loss: 0.9465 - accuracy: 0.5975 - val_loss: 0.9521 - val_accuracy: 0.5823 - lr: 0.0010
Epoch 4/100 - loss: 0.9283 - accuracy: 0.6029 - val_loss: 0.8993 - val_accuracy: 0.6158 - lr: 0.0010 (val_accuracy improved to 0.61583, model saved)
Epoch 5/100 - loss: 0.8965 - accuracy: 0.6237 - val_loss: 1.0021 - val_accuracy: 0.5773 - lr: 0.0010
Epoch 6/100 - loss: 0.8771 - accuracy: 0.6313 - val_loss: 0.9191 - val_accuracy: 0.6106 - lr: 0.0010
Epoch 7/100 - loss: 0.8616 - accuracy: 0.6366 - val_loss: 0.9665 - val_accuracy: 0.5881 - lr: 0.0010 (ReduceLROnPlateau: lr reduced to 2.0000e-04)
Epoch 8/100 - loss: 0.8029 - accuracy: 0.6680 - val_loss: 0.8538 - val_accuracy: 0.6430 - lr: 2.0000e-04 (val_accuracy improved to 0.64296, model saved)
Epoch 9/100 - loss: 0.7846 - accuracy: 0.6760 - val_loss: 0.8257 - val_accuracy: 0.6468 - lr: 2.0000e-04 (val_accuracy improved to 0.64678, model saved)
Epoch 10/100 - loss: 0.7748 - accuracy: 0.6801 - val_loss: 0.8253 - val_accuracy: 0.6596 - lr: 2.0000e-04 (val_accuracy improved to 0.65963, model saved)
Epoch 11/100 - loss: 0.7625 - accuracy: 0.6875 - val_loss: 0.8434 - val_accuracy: 0.6482 - lr: 2.0000e-04
Epoch 12/100 - loss: 0.7517 - accuracy: 0.6942 - val_loss: 0.8930 - val_accuracy: 0.6279 - lr: 2.0000e-04
Epoch 13/100 - loss: 0.7449 - accuracy: 0.6992 - val_loss: 0.8513 - val_accuracy: 0.6446 - lr: 2.0000e-04 (ReduceLROnPlateau: lr reduced to 4.0000e-05)
Epoch 14/100 - loss: 0.7301 - accuracy: 0.7020 - val_loss: 0.8147 - val_accuracy: 0.6633 - lr: 4.0000e-05 (val_accuracy improved to 0.66325, model saved)
Epoch 15/100 - loss: 0.7168 - accuracy: 0.7119 - val_loss: 0.8271 - val_accuracy: 0.6612 - lr: 4.0000e-05
Epoch 16/100 - loss: 0.7163 - accuracy: 0.7128 - val_loss: 0.8405 - val_accuracy: 0.6550 - lr: 4.0000e-05
Epoch 17/100 - loss: 0.7055 - accuracy: 0.7178 - val_loss: 0.8243 - val_accuracy: 0.6673 - lr: 4.0000e-05 (val_accuracy improved to 0.66727, model saved; ReduceLROnPlateau: lr reduced to 8.0000e-06)
Epoch 18/100 - loss: 0.7059 - accuracy: 0.7173 - val_loss: 0.8273 - val_accuracy: 0.6641 - lr: 8.0000e-06
Epoch 19/100 - loss: 0.7080 - accuracy: 0.7122 - val_loss: 0.8163 - val_accuracy: 0.6641 - lr: 8.0000e-06
Epoch 20/100 - loss: 0.7014 - accuracy: 0.7191 - val_loss: 0.8168 - val_accuracy: 0.6661 - lr: 8.0000e-06 (ReduceLROnPlateau: lr reduced to 1.6000e-06)
Epoch 21/100 - loss: 0.7071 - accuracy: 0.7149 - val_loss: 0.8214 - val_accuracy: 0.6548 - lr: 1.6000e-06
Epoch 22/100 - loss: 0.7029 - accuracy: 0.7122 - val_loss: 0.8246 - val_accuracy: 0.6612 - lr: 1.6000e-06
Epoch 23/100 - (output truncated)
- ETA: 0s - loss: 0.6989 - accuracy: 0.7184 Epoch 23: val_accuracy did not improve from 0.66727 Epoch 23: ReduceLROnPlateau reducing learning rate to 3.200000264769187e-07. 473/473 [==============================] - 24s 52ms/step - loss: 0.6986 - accuracy: 0.7186 - val_loss: 0.8369 - val_accuracy: 0.6604 - lr: 1.6000e-06 Epoch 24/100 473/473 [==============================] - ETA: 0s - loss: 0.7066 - accuracy: 0.7187 Epoch 24: val_accuracy did not improve from 0.66727 473/473 [==============================] - 25s 53ms/step - loss: 0.7066 - accuracy: 0.7187 - val_loss: 0.8291 - val_accuracy: 0.6526 - lr: 3.2000e-07 Epoch 25/100 473/473 [==============================] - ETA: 0s - loss: 0.7011 - accuracy: 0.7216 Epoch 25: val_accuracy did not improve from 0.66727 473/473 [==============================] - 26s 54ms/step - loss: 0.7011 - accuracy: 0.7216 - val_loss: 0.8222 - val_accuracy: 0.6624 - lr: 3.2000e-07 Epoch 26/100 473/473 [==============================] - ETA: 0s - loss: 0.6997 - accuracy: 0.7165 Epoch 26: val_accuracy did not improve from 0.66727 473/473 [==============================] - 25s 52ms/step - loss: 0.6997 - accuracy: 0.7165 - val_loss: 0.8136 - val_accuracy: 0.6641 - lr: 3.2000e-07 Epoch 27/100 472/473 [============================>.] - ETA: 0s - loss: 0.7075 - accuracy: 0.7138 Epoch 27: val_accuracy improved from 0.66727 to 0.66787, saving model to ./model_3.h5 473/473 [==============================] - 25s 52ms/step - loss: 0.7071 - accuracy: 0.7139 - val_loss: 0.8259 - val_accuracy: 0.6679 - lr: 3.2000e-07 Epoch 28/100 472/473 [============================>.] - ETA: 0s - loss: 0.7100 - accuracy: 0.7143 Epoch 28: val_accuracy did not improve from 0.66787 473/473 [==============================] - 25s 52ms/step - loss: 0.7100 - accuracy: 0.7142 - val_loss: 0.8182 - val_accuracy: 0.6659 - lr: 3.2000e-07 Epoch 29/100 472/473 [============================>.] 
- ETA: 0s - loss: 0.7110 - accuracy: 0.7148 Epoch 29: val_accuracy improved from 0.66787 to 0.66868, saving model to ./model_3.h5 473/473 [==============================] - 25s 54ms/step - loss: 0.7107 - accuracy: 0.7150 - val_loss: 0.8042 - val_accuracy: 0.6687 - lr: 3.2000e-07 Epoch 30/100 472/473 [============================>.] - ETA: 0s - loss: 0.7077 - accuracy: 0.7147 Epoch 30: val_accuracy did not improve from 0.66868 473/473 [==============================] - 25s 53ms/step - loss: 0.7076 - accuracy: 0.7147 - val_loss: 0.8247 - val_accuracy: 0.6628 - lr: 3.2000e-07 Epoch 31/100 473/473 [==============================] - ETA: 0s - loss: 0.7021 - accuracy: 0.7169 Epoch 31: val_accuracy did not improve from 0.66868 473/473 [==============================] - 25s 52ms/step - loss: 0.7021 - accuracy: 0.7169 - val_loss: 0.8272 - val_accuracy: 0.6626 - lr: 3.2000e-07 Epoch 32/100 473/473 [==============================] - ETA: 0s - loss: 0.7035 - accuracy: 0.7140 Epoch 32: val_accuracy did not improve from 0.66868 Epoch 32: ReduceLROnPlateau reducing learning rate to 6.400000529538374e-08. 473/473 [==============================] - 25s 53ms/step - loss: 0.7035 - accuracy: 0.7140 - val_loss: 0.8372 - val_accuracy: 0.6608 - lr: 3.2000e-07 Epoch 33/100 473/473 [==============================] - ETA: 0s - loss: 0.7164 - accuracy: 0.7110 Epoch 33: val_accuracy did not improve from 0.66868 473/473 [==============================] - 25s 52ms/step - loss: 0.7164 - accuracy: 0.7110 - val_loss: 0.8113 - val_accuracy: 0.6622 - lr: 6.4000e-08 Epoch 34/100 473/473 [==============================] - ETA: 0s - loss: 0.7049 - accuracy: 0.7217 Epoch 34: val_accuracy did not improve from 0.66868 473/473 [==============================] - 26s 56ms/step - loss: 0.7049 - accuracy: 0.7217 - val_loss: 0.8284 - val_accuracy: 0.6653 - lr: 6.4000e-08 Epoch 35/100 472/473 [============================>.] 
- ETA: 0s - loss: 0.6998 - accuracy: 0.7208 Epoch 35: val_accuracy did not improve from 0.66868 Epoch 35: ReduceLROnPlateau reducing learning rate to 1.2800001059076749e-08. 473/473 [==============================] - 25s 53ms/step - loss: 0.6995 - accuracy: 0.7211 - val_loss: 0.8243 - val_accuracy: 0.6624 - lr: 6.4000e-08 Epoch 36/100 473/473 [==============================] - ETA: 0s - loss: 0.7031 - accuracy: 0.7179 Epoch 36: val_accuracy did not improve from 0.66868 473/473 [==============================] - 25s 53ms/step - loss: 0.7031 - accuracy: 0.7179 - val_loss: 0.8346 - val_accuracy: 0.6552 - lr: 1.2800e-08 Epoch 37/100 473/473 [==============================] - ETA: 0s - loss: 0.7026 - accuracy: 0.7196 Epoch 37: val_accuracy did not improve from 0.66868 473/473 [==============================] - 25s 52ms/step - loss: 0.7026 - accuracy: 0.7196 - val_loss: 0.8184 - val_accuracy: 0.6641 - lr: 1.2800e-08 Epoch 38/100 472/473 [============================>.] - ETA: 0s - loss: 0.6993 - accuracy: 0.7222 Epoch 38: val_accuracy did not improve from 0.66868 Epoch 38: ReduceLROnPlateau reducing learning rate to 2.5600002118153498e-09. 473/473 [==============================] - 25s 54ms/step - loss: 0.6993 - accuracy: 0.7222 - val_loss: 0.8251 - val_accuracy: 0.6556 - lr: 1.2800e-08 Epoch 39/100 472/473 [============================>.] - ETA: 0s - loss: 0.7074 - accuracy: 0.7116 Epoch 39: val_accuracy did not improve from 0.66868 473/473 [==============================] - 24s 51ms/step - loss: 0.7083 - accuracy: 0.7112 - val_loss: 0.8227 - val_accuracy: 0.6576 - lr: 2.5600e-09 Epoch 40/100 472/473 [============================>.] - ETA: 0s - loss: 0.7031 - accuracy: 0.7163 Epoch 40: val_accuracy did not improve from 0.66868 473/473 [==============================] - 24s 51ms/step - loss: 0.7029 - accuracy: 0.7163 - val_loss: 0.8238 - val_accuracy: 0.6651 - lr: 2.5600e-09 Epoch 41/100 472/473 [============================>.] 
- ETA: 0s - loss: 0.6988 - accuracy: 0.7201 Epoch 41: val_accuracy did not improve from 0.66868 Epoch 41: ReduceLROnPlateau reducing learning rate to 5.1200004236307e-10. 473/473 [==============================] - 25s 53ms/step - loss: 0.6992 - accuracy: 0.7200 - val_loss: 0.8195 - val_accuracy: 0.6628 - lr: 2.5600e-09 Epoch 42/100 473/473 [==============================] - ETA: 0s - loss: 0.7109 - accuracy: 0.7165 Epoch 42: val_accuracy did not improve from 0.66868 473/473 [==============================] - 26s 55ms/step - loss: 0.7109 - accuracy: 0.7165 - val_loss: 0.8230 - val_accuracy: 0.6659 - lr: 5.1200e-10 Epoch 43/100 473/473 [==============================] - ETA: 0s - loss: 0.6996 - accuracy: 0.7215 Epoch 43: val_accuracy did not improve from 0.66868 473/473 [==============================] - 25s 53ms/step - loss: 0.6996 - accuracy: 0.7215 - val_loss: 0.8386 - val_accuracy: 0.6566 - lr: 5.1200e-10 Epoch 44/100 473/473 [==============================] - ETA: 0s - loss: 0.7026 - accuracy: 0.7174 Epoch 44: val_accuracy did not improve from 0.66868 Restoring model weights from the end of the best epoch: 29. Epoch 44: ReduceLROnPlateau reducing learning rate to 1.0240001069306004e-10. 473/473 [==============================] - 25s 52ms/step - loss: 0.7026 - accuracy: 0.7174 - val_loss: 0.8289 - val_accuracy: 0.6626 - lr: 5.1200e-10 Epoch 44: early stopping
# Plotting the accuracies
plt.figure(figsize = (10, 5))
plt.plot(history_3.history['accuracy'])
plt.plot(history_3.history['val_accuracy'])
plt.title('Accuracy - VGG16 Model')
plt.ylabel('Accuracy')
plt.xlabel('Epoch')
plt.legend(['Training', 'Validation'], loc='lower right')
plt.show()
# Plotting the losses
plt.figure(figsize = (10, 5))
plt.plot(history_3.history['loss'])
plt.plot(history_3.history['val_loss'])
plt.title('Loss - VGG16 Model')
plt.ylabel('Loss')
plt.xlabel('Epoch')
plt.legend(['Training', 'Validation'], loc='upper right')
plt.show()
# Evaluating the model's performance on the test set
accuracy = model_3.evaluate(test_set_rgb)
4/4 [==============================] - 0s 40ms/step - loss: 0.7386 - accuracy: 0.6562
Observations and Insights:
As imported and modified, our transfer learning model performs similarly to the models developed above. At the best epoch (29), training accuracy stands at 0.72 and validation accuracy at 0.67; accuracy and loss for both the training and validation data level off before early stopping ends the run. Accuracy on the test data is 0.66. These scores are roughly in line with those of Model 1, our baseline model.
The VGG16 model was ultimately imported up to the block4_pool layer, as that truncation produced the best validation performance. A comparison of the alternatives tried is below.
| Model | Train Loss | Train Accuracy | Val Loss | Val Accuracy |
|---|---|---|---|---|
| VGG16 block4_pool (selected) | 0.71 | 0.72 | 0.80 | 0.67 |
| VGG16 block5_pool | 1.05 | 0.54 | 1.10 | 0.52 |
| VGG16 block3_pool | 0.79 | 0.69 | 0.77 | 0.66 |
| VGG16 block2_pool | 0.71 | 0.71 | 0.82 | 0.65 |
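The truncation compared in the table above can be sketched with the standard `tensorflow.keras` API. This is a minimal illustration, not the project's exact code: `weights=None` is used here to skip the ImageNet download (the project loads `weights="imagenet"`), and the small `Dense` head is a placeholder assumption.

```python
from tensorflow.keras import layers, models
from tensorflow.keras.applications import VGG16

# Load the VGG16 convolutional base without its 1,000-class classifier head
base = VGG16(include_top=False, weights=None, input_shape=(48, 48, 3))

# Cut the network at block4_pool, the best-performing truncation point
truncated = models.Model(inputs=base.input,
                         outputs=base.get_layer("block4_pool").output)
truncated.trainable = False  # freeze the pre-trained layers

# Attach an illustrative classification head for the four emotion classes
x = layers.Flatten()(truncated.output)
x = layers.Dense(256, activation="relu")(x)
outputs = layers.Dense(4, activation="softmax")(x)
model = models.Model(truncated.input, outputs)

print(model.output_shape)  # (None, 4)
```

Cutting at block4_pool keeps the generic edge and texture filters of the early blocks while discarding the most ImageNet-specific block5 features, which is consistent with block5_pool performing worst in the table.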
Our second transfer learning model is ResNet v2, a CNN trained on over 1 million images from the ImageNet database that can classify images into 1,000 different categories. Like VGG16, the color mode must be set to RGB to leverage this pre-trained architecture.
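Because the project's images are grayscale, satisfying the RGB requirement amounts to repeating the single channel three times (the Keras data generators do this when `color_mode="rgb"` is requested). A minimal NumPy sketch of the same idea, with an assumed helper name:

```python
import numpy as np

def grayscale_to_rgb(batch):
    """Repeat a (N, H, W, 1) grayscale batch into a (N, H, W, 3) RGB batch."""
    return np.repeat(batch, 3, axis=-1)

imgs = np.random.rand(2, 48, 48, 1).astype("float32")
rgb = grayscale_to_rgb(imgs)
print(rgb.shape)  # (2, 48, 48, 3)
```

No color information is created by this step; it only matches the three-channel input shape the pre-trained weights expect.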
Resnet = ap.ResNet101(include_top = False, weights = "imagenet", input_shape=(48,48,3))
Resnet.summary()
Model: "resnet101"
__________________________________________________________________________________________________
Layer (type) Output Shape Param # Connected to
==================================================================================================
input_2 (InputLayer) [(None, 48, 48, 3)] 0 []
conv1_pad (ZeroPadding2D) (None, 54, 54, 3) 0 ['input_2[0][0]']
conv1_conv (Conv2D) (None, 24, 24, 64) 9472 ['conv1_pad[0][0]']
conv1_bn (BatchNormalization) (None, 24, 24, 64) 256 ['conv1_conv[0][0]']
conv1_relu (Activation) (None, 24, 24, 64) 0 ['conv1_bn[0][0]']
pool1_pad (ZeroPadding2D) (None, 26, 26, 64) 0 ['conv1_relu[0][0]']
pool1_pool (MaxPooling2D) (None, 12, 12, 64) 0 ['pool1_pad[0][0]']
conv2_block1_1_conv (Conv2D) (None, 12, 12, 64) 4160 ['pool1_pool[0][0]']
conv2_block1_1_bn (BatchNormal (None, 12, 12, 64) 256 ['conv2_block1_1_conv[0][0]']
ization)
conv2_block1_1_relu (Activatio (None, 12, 12, 64) 0 ['conv2_block1_1_bn[0][0]']
n)
conv2_block1_2_conv (Conv2D) (None, 12, 12, 64) 36928 ['conv2_block1_1_relu[0][0]']
conv2_block1_2_bn (BatchNormal (None, 12, 12, 64) 256 ['conv2_block1_2_conv[0][0]']
ization)
conv2_block1_2_relu (Activatio (None, 12, 12, 64) 0 ['conv2_block1_2_bn[0][0]']
n)
conv2_block1_0_conv (Conv2D) (None, 12, 12, 256) 16640 ['pool1_pool[0][0]']
conv2_block1_3_conv (Conv2D) (None, 12, 12, 256) 16640 ['conv2_block1_2_relu[0][0]']
conv2_block1_0_bn (BatchNormal (None, 12, 12, 256) 1024 ['conv2_block1_0_conv[0][0]']
ization)
conv2_block1_3_bn (BatchNormal (None, 12, 12, 256) 1024 ['conv2_block1_3_conv[0][0]']
ization)
conv2_block1_add (Add) (None, 12, 12, 256) 0 ['conv2_block1_0_bn[0][0]',
'conv2_block1_3_bn[0][0]']
conv2_block1_out (Activation) (None, 12, 12, 256) 0 ['conv2_block1_add[0][0]']
conv2_block2_1_conv (Conv2D) (None, 12, 12, 64) 16448 ['conv2_block1_out[0][0]']
conv2_block2_1_bn (BatchNormal (None, 12, 12, 64) 256 ['conv2_block2_1_conv[0][0]']
ization)
conv2_block2_1_relu (Activatio (None, 12, 12, 64) 0 ['conv2_block2_1_bn[0][0]']
n)
conv2_block2_2_conv (Conv2D) (None, 12, 12, 64) 36928 ['conv2_block2_1_relu[0][0]']
conv2_block2_2_bn (BatchNormal (None, 12, 12, 64) 256 ['conv2_block2_2_conv[0][0]']
ization)
conv2_block2_2_relu (Activatio (None, 12, 12, 64) 0 ['conv2_block2_2_bn[0][0]']
n)
conv2_block2_3_conv (Conv2D) (None, 12, 12, 256) 16640 ['conv2_block2_2_relu[0][0]']
conv2_block2_3_bn (BatchNormal (None, 12, 12, 256) 1024 ['conv2_block2_3_conv[0][0]']
ization)
conv2_block2_add (Add) (None, 12, 12, 256) 0 ['conv2_block1_out[0][0]',
'conv2_block2_3_bn[0][0]']
conv2_block2_out (Activation) (None, 12, 12, 256) 0 ['conv2_block2_add[0][0]']
conv2_block3_1_conv (Conv2D) (None, 12, 12, 64) 16448 ['conv2_block2_out[0][0]']
conv2_block3_1_bn (BatchNormal (None, 12, 12, 64) 256 ['conv2_block3_1_conv[0][0]']
ization)
conv2_block3_1_relu (Activatio (None, 12, 12, 64) 0 ['conv2_block3_1_bn[0][0]']
n)
conv2_block3_2_conv (Conv2D) (None, 12, 12, 64) 36928 ['conv2_block3_1_relu[0][0]']
conv2_block3_2_bn (BatchNormal (None, 12, 12, 64) 256 ['conv2_block3_2_conv[0][0]']
ization)
conv2_block3_2_relu (Activatio (None, 12, 12, 64) 0 ['conv2_block3_2_bn[0][0]']
n)
conv2_block3_3_conv (Conv2D) (None, 12, 12, 256) 16640 ['conv2_block3_2_relu[0][0]']
conv2_block3_3_bn (BatchNormal (None, 12, 12, 256) 1024 ['conv2_block3_3_conv[0][0]']
ization)
conv2_block3_add (Add) (None, 12, 12, 256) 0 ['conv2_block2_out[0][0]',
'conv2_block3_3_bn[0][0]']
conv2_block3_out (Activation) (None, 12, 12, 256) 0 ['conv2_block3_add[0][0]']
conv3_block1_1_conv (Conv2D) (None, 6, 6, 128) 32896 ['conv2_block3_out[0][0]']
conv3_block1_1_bn (BatchNormal (None, 6, 6, 128) 512 ['conv3_block1_1_conv[0][0]']
ization)
conv3_block1_1_relu (Activatio (None, 6, 6, 128) 0 ['conv3_block1_1_bn[0][0]']
n)
conv3_block1_2_conv (Conv2D) (None, 6, 6, 128) 147584 ['conv3_block1_1_relu[0][0]']
conv3_block1_2_bn (BatchNormal (None, 6, 6, 128) 512 ['conv3_block1_2_conv[0][0]']
ization)
conv3_block1_2_relu (Activatio (None, 6, 6, 128) 0 ['conv3_block1_2_bn[0][0]']
n)
conv3_block1_0_conv (Conv2D) (None, 6, 6, 512) 131584 ['conv2_block3_out[0][0]']
conv3_block1_3_conv (Conv2D) (None, 6, 6, 512) 66048 ['conv3_block1_2_relu[0][0]']
conv3_block1_0_bn (BatchNormal (None, 6, 6, 512) 2048 ['conv3_block1_0_conv[0][0]']
ization)
conv3_block1_3_bn (BatchNormal (None, 6, 6, 512) 2048 ['conv3_block1_3_conv[0][0]']
ization)
conv3_block1_add (Add) (None, 6, 6, 512) 0 ['conv3_block1_0_bn[0][0]',
'conv3_block1_3_bn[0][0]']
conv3_block1_out (Activation) (None, 6, 6, 512) 0 ['conv3_block1_add[0][0]']
conv3_block2_1_conv (Conv2D) (None, 6, 6, 128) 65664 ['conv3_block1_out[0][0]']
conv3_block2_1_bn (BatchNormal (None, 6, 6, 128) 512 ['conv3_block2_1_conv[0][0]']
ization)
conv3_block2_1_relu (Activatio (None, 6, 6, 128) 0 ['conv3_block2_1_bn[0][0]']
n)
conv3_block2_2_conv (Conv2D) (None, 6, 6, 128) 147584 ['conv3_block2_1_relu[0][0]']
conv3_block2_2_bn (BatchNormal (None, 6, 6, 128) 512 ['conv3_block2_2_conv[0][0]']
ization)
conv3_block2_2_relu (Activatio (None, 6, 6, 128) 0 ['conv3_block2_2_bn[0][0]']
n)
conv3_block2_3_conv (Conv2D) (None, 6, 6, 512) 66048 ['conv3_block2_2_relu[0][0]']
conv3_block2_3_bn (BatchNormal (None, 6, 6, 512) 2048 ['conv3_block2_3_conv[0][0]']
ization)
conv3_block2_add (Add) (None, 6, 6, 512) 0 ['conv3_block1_out[0][0]',
'conv3_block2_3_bn[0][0]']
conv3_block2_out (Activation) (None, 6, 6, 512) 0 ['conv3_block2_add[0][0]']
conv3_block3_1_conv (Conv2D) (None, 6, 6, 128) 65664 ['conv3_block2_out[0][0]']
conv3_block3_1_bn (BatchNormal (None, 6, 6, 128) 512 ['conv3_block3_1_conv[0][0]']
ization)
conv3_block3_1_relu (Activatio (None, 6, 6, 128) 0 ['conv3_block3_1_bn[0][0]']
n)
conv3_block3_2_conv (Conv2D) (None, 6, 6, 128) 147584 ['conv3_block3_1_relu[0][0]']
conv3_block3_2_bn (BatchNormal (None, 6, 6, 128) 512 ['conv3_block3_2_conv[0][0]']
ization)
conv3_block3_2_relu (Activatio (None, 6, 6, 128) 0 ['conv3_block3_2_bn[0][0]']
n)
conv3_block3_3_conv (Conv2D) (None, 6, 6, 512) 66048 ['conv3_block3_2_relu[0][0]']
conv3_block3_3_bn (BatchNormal (None, 6, 6, 512) 2048 ['conv3_block3_3_conv[0][0]']
ization)
conv3_block3_add (Add) (None, 6, 6, 512) 0 ['conv3_block2_out[0][0]',
'conv3_block3_3_bn[0][0]']
conv3_block3_out (Activation) (None, 6, 6, 512) 0 ['conv3_block3_add[0][0]']
conv3_block4_1_conv (Conv2D) (None, 6, 6, 128) 65664 ['conv3_block3_out[0][0]']
conv3_block4_1_bn (BatchNormal (None, 6, 6, 128) 512 ['conv3_block4_1_conv[0][0]']
ization)
conv3_block4_1_relu (Activatio (None, 6, 6, 128) 0 ['conv3_block4_1_bn[0][0]']
n)
conv3_block4_2_conv (Conv2D) (None, 6, 6, 128) 147584 ['conv3_block4_1_relu[0][0]']
conv3_block4_2_bn (BatchNormal (None, 6, 6, 128) 512 ['conv3_block4_2_conv[0][0]']
ization)
conv3_block4_2_relu (Activatio (None, 6, 6, 128) 0 ['conv3_block4_2_bn[0][0]']
n)
conv3_block4_3_conv (Conv2D) (None, 6, 6, 512) 66048 ['conv3_block4_2_relu[0][0]']
conv3_block4_3_bn (BatchNormal (None, 6, 6, 512) 2048 ['conv3_block4_3_conv[0][0]']
ization)
conv3_block4_add (Add) (None, 6, 6, 512) 0 ['conv3_block3_out[0][0]',
'conv3_block4_3_bn[0][0]']
conv3_block4_out (Activation) (None, 6, 6, 512) 0 ['conv3_block4_add[0][0]']
conv4_block1_1_conv (Conv2D) (None, 3, 3, 256) 131328 ['conv3_block4_out[0][0]']
conv4_block1_1_bn (BatchNormal (None, 3, 3, 256) 1024 ['conv4_block1_1_conv[0][0]']
ization)
conv4_block1_1_relu (Activatio (None, 3, 3, 256) 0 ['conv4_block1_1_bn[0][0]']
n)
conv4_block1_2_conv (Conv2D) (None, 3, 3, 256) 590080 ['conv4_block1_1_relu[0][0]']
conv4_block1_2_bn (BatchNormal (None, 3, 3, 256) 1024 ['conv4_block1_2_conv[0][0]']
ization)
conv4_block1_2_relu (Activatio (None, 3, 3, 256) 0 ['conv4_block1_2_bn[0][0]']
n)
conv4_block1_0_conv (Conv2D) (None, 3, 3, 1024) 525312 ['conv3_block4_out[0][0]']
conv4_block1_3_conv (Conv2D) (None, 3, 3, 1024) 263168 ['conv4_block1_2_relu[0][0]']
conv4_block1_0_bn (BatchNormal (None, 3, 3, 1024) 4096 ['conv4_block1_0_conv[0][0]']
ization)
conv4_block1_3_bn (BatchNormal (None, 3, 3, 1024) 4096 ['conv4_block1_3_conv[0][0]']
ization)
conv4_block1_add (Add) (None, 3, 3, 1024) 0 ['conv4_block1_0_bn[0][0]',
'conv4_block1_3_bn[0][0]']
conv4_block1_out (Activation) (None, 3, 3, 1024) 0 ['conv4_block1_add[0][0]']
conv4_block2_1_conv (Conv2D) (None, 3, 3, 256) 262400 ['conv4_block1_out[0][0]']
conv4_block2_1_bn (BatchNormal (None, 3, 3, 256) 1024 ['conv4_block2_1_conv[0][0]']
ization)
conv4_block2_1_relu (Activatio (None, 3, 3, 256) 0 ['conv4_block2_1_bn[0][0]']
n)
conv4_block2_2_conv (Conv2D) (None, 3, 3, 256) 590080 ['conv4_block2_1_relu[0][0]']
conv4_block2_2_bn (BatchNormal (None, 3, 3, 256) 1024 ['conv4_block2_2_conv[0][0]']
ization)
conv4_block2_2_relu (Activatio (None, 3, 3, 256) 0 ['conv4_block2_2_bn[0][0]']
n)
conv4_block2_3_conv (Conv2D) (None, 3, 3, 1024) 263168 ['conv4_block2_2_relu[0][0]']
conv4_block2_3_bn (BatchNormal (None, 3, 3, 1024) 4096 ['conv4_block2_3_conv[0][0]']
ization)
conv4_block2_add (Add) (None, 3, 3, 1024) 0 ['conv4_block1_out[0][0]',
'conv4_block2_3_bn[0][0]']
conv4_block2_out (Activation) (None, 3, 3, 1024) 0 ['conv4_block2_add[0][0]']
conv4_block3_1_conv (Conv2D) (None, 3, 3, 256) 262400 ['conv4_block2_out[0][0]']
conv4_block3_1_bn (BatchNormal (None, 3, 3, 256) 1024 ['conv4_block3_1_conv[0][0]']
ization)
conv4_block3_1_relu (Activatio (None, 3, 3, 256) 0 ['conv4_block3_1_bn[0][0]']
n)
conv4_block3_2_conv (Conv2D) (None, 3, 3, 256) 590080 ['conv4_block3_1_relu[0][0]']
conv4_block3_2_bn (BatchNormal (None, 3, 3, 256) 1024 ['conv4_block3_2_conv[0][0]']
ization)
conv4_block3_2_relu (Activatio (None, 3, 3, 256) 0 ['conv4_block3_2_bn[0][0]']
n)
conv4_block3_3_conv (Conv2D) (None, 3, 3, 1024) 263168 ['conv4_block3_2_relu[0][0]']
conv4_block3_3_bn (BatchNormal (None, 3, 3, 1024) 4096 ['conv4_block3_3_conv[0][0]']
ization)
conv4_block3_add (Add) (None, 3, 3, 1024) 0 ['conv4_block2_out[0][0]',
'conv4_block3_3_bn[0][0]']
conv4_block3_out (Activation) (None, 3, 3, 1024) 0 ['conv4_block3_add[0][0]']
conv4_block4_1_conv (Conv2D) (None, 3, 3, 256) 262400 ['conv4_block3_out[0][0]']
conv4_block4_1_bn (BatchNormal (None, 3, 3, 256) 1024 ['conv4_block4_1_conv[0][0]']
ization)
conv4_block4_1_relu (Activatio (None, 3, 3, 256) 0 ['conv4_block4_1_bn[0][0]']
n)
conv4_block4_2_conv (Conv2D) (None, 3, 3, 256) 590080 ['conv4_block4_1_relu[0][0]']
conv4_block4_2_bn (BatchNormal (None, 3, 3, 256) 1024 ['conv4_block4_2_conv[0][0]']
ization)
conv4_block4_2_relu (Activatio (None, 3, 3, 256) 0 ['conv4_block4_2_bn[0][0]']
n)
conv4_block4_3_conv (Conv2D) (None, 3, 3, 1024) 263168 ['conv4_block4_2_relu[0][0]']
conv4_block4_3_bn (BatchNormal (None, 3, 3, 1024) 4096 ['conv4_block4_3_conv[0][0]']
ization)
conv4_block4_add (Add) (None, 3, 3, 1024) 0 ['conv4_block3_out[0][0]',
'conv4_block4_3_bn[0][0]']
conv4_block4_out (Activation) (None, 3, 3, 1024) 0 ['conv4_block4_add[0][0]']
conv4_block5_1_conv (Conv2D) (None, 3, 3, 256) 262400 ['conv4_block4_out[0][0]']
conv4_block5_1_bn (BatchNormal (None, 3, 3, 256) 1024 ['conv4_block5_1_conv[0][0]']
ization)
conv4_block5_1_relu (Activatio (None, 3, 3, 256) 0 ['conv4_block5_1_bn[0][0]']
n)
conv4_block5_2_conv (Conv2D) (None, 3, 3, 256) 590080 ['conv4_block5_1_relu[0][0]']
conv4_block5_2_bn (BatchNormal (None, 3, 3, 256) 1024 ['conv4_block5_2_conv[0][0]']
ization)
conv4_block5_2_relu (Activatio (None, 3, 3, 256) 0 ['conv4_block5_2_bn[0][0]']
n)
conv4_block5_3_conv (Conv2D) (None, 3, 3, 1024) 263168 ['conv4_block5_2_relu[0][0]']
conv4_block5_3_bn (BatchNormal (None, 3, 3, 1024) 4096 ['conv4_block5_3_conv[0][0]']
ization)
conv4_block5_add (Add) (None, 3, 3, 1024) 0 ['conv4_block4_out[0][0]',
'conv4_block5_3_bn[0][0]']
conv4_block5_out (Activation) (None, 3, 3, 1024) 0 ['conv4_block5_add[0][0]']
conv4_block6_1_conv (Conv2D) (None, 3, 3, 256) 262400 ['conv4_block5_out[0][0]']
conv4_block6_1_bn (BatchNormal (None, 3, 3, 256) 1024 ['conv4_block6_1_conv[0][0]']
ization)
conv4_block6_1_relu (Activatio (None, 3, 3, 256) 0 ['conv4_block6_1_bn[0][0]']
n)
conv4_block6_2_conv (Conv2D) (None, 3, 3, 256) 590080 ['conv4_block6_1_relu[0][0]']
conv4_block6_2_bn (BatchNormal (None, 3, 3, 256) 1024 ['conv4_block6_2_conv[0][0]']
ization)
conv4_block6_2_relu (Activatio (None, 3, 3, 256) 0 ['conv4_block6_2_bn[0][0]']
n)
conv4_block6_3_conv (Conv2D) (None, 3, 3, 1024) 263168 ['conv4_block6_2_relu[0][0]']
conv4_block6_3_bn (BatchNormal (None, 3, 3, 1024) 4096 ['conv4_block6_3_conv[0][0]']
ization)
conv4_block6_add (Add) (None, 3, 3, 1024) 0 ['conv4_block5_out[0][0]',
'conv4_block6_3_bn[0][0]']
conv4_block6_out (Activation) (None, 3, 3, 1024) 0 ['conv4_block6_add[0][0]']
conv4_block7_1_conv (Conv2D) (None, 3, 3, 256) 262400 ['conv4_block6_out[0][0]']
conv4_block7_1_bn (BatchNormal (None, 3, 3, 256) 1024 ['conv4_block7_1_conv[0][0]']
ization)
conv4_block7_1_relu (Activatio (None, 3, 3, 256) 0 ['conv4_block7_1_bn[0][0]']
n)
conv4_block7_2_conv (Conv2D) (None, 3, 3, 256) 590080 ['conv4_block7_1_relu[0][0]']
conv4_block7_2_bn (BatchNormal (None, 3, 3, 256) 1024 ['conv4_block7_2_conv[0][0]']
ization)
conv4_block7_2_relu (Activatio (None, 3, 3, 256) 0 ['conv4_block7_2_bn[0][0]']
n)
conv4_block7_3_conv (Conv2D) (None, 3, 3, 1024) 263168 ['conv4_block7_2_relu[0][0]']
conv4_block7_3_bn (BatchNormal (None, 3, 3, 1024) 4096 ['conv4_block7_3_conv[0][0]']
ization)
conv4_block7_add (Add) (None, 3, 3, 1024) 0 ['conv4_block6_out[0][0]',
'conv4_block7_3_bn[0][0]']
conv4_block7_out (Activation) (None, 3, 3, 1024) 0 ['conv4_block7_add[0][0]']
conv4_block8_1_conv (Conv2D) (None, 3, 3, 256) 262400 ['conv4_block7_out[0][0]']
conv4_block8_1_bn (BatchNormal (None, 3, 3, 256) 1024 ['conv4_block8_1_conv[0][0]']
ization)
conv4_block8_1_relu (Activatio (None, 3, 3, 256) 0 ['conv4_block8_1_bn[0][0]']
n)
conv4_block8_2_conv (Conv2D) (None, 3, 3, 256) 590080 ['conv4_block8_1_relu[0][0]']
conv4_block8_2_bn (BatchNormal (None, 3, 3, 256) 1024 ['conv4_block8_2_conv[0][0]']
ization)
conv4_block8_2_relu (Activatio (None, 3, 3, 256) 0 ['conv4_block8_2_bn[0][0]']
n)
conv4_block8_3_conv (Conv2D) (None, 3, 3, 1024) 263168 ['conv4_block8_2_relu[0][0]']
conv4_block8_3_bn (BatchNormal (None, 3, 3, 1024) 4096 ['conv4_block8_3_conv[0][0]']
ization)
conv4_block8_add (Add) (None, 3, 3, 1024) 0 ['conv4_block7_out[0][0]',
'conv4_block8_3_bn[0][0]']
conv4_block8_out (Activation) (None, 3, 3, 1024) 0 ['conv4_block8_add[0][0]']
conv4_block9_1_conv (Conv2D) (None, 3, 3, 256) 262400 ['conv4_block8_out[0][0]']
conv4_block9_1_bn (BatchNormal (None, 3, 3, 256) 1024 ['conv4_block9_1_conv[0][0]']
ization)
conv4_block9_1_relu (Activatio (None, 3, 3, 256) 0 ['conv4_block9_1_bn[0][0]']
n)
conv4_block9_2_conv (Conv2D) (None, 3, 3, 256) 590080 ['conv4_block9_1_relu[0][0]']
conv4_block9_2_bn (BatchNormal (None, 3, 3, 256) 1024 ['conv4_block9_2_conv[0][0]']
ization)
conv4_block9_2_relu (Activatio (None, 3, 3, 256) 0 ['conv4_block9_2_bn[0][0]']
n)
conv4_block9_3_conv (Conv2D) (None, 3, 3, 1024) 263168 ['conv4_block9_2_relu[0][0]']
conv4_block9_3_bn (BatchNormal (None, 3, 3, 1024) 4096 ['conv4_block9_3_conv[0][0]']
ization)
conv4_block9_add (Add) (None, 3, 3, 1024) 0 ['conv4_block8_out[0][0]',
'conv4_block9_3_bn[0][0]']
conv4_block9_out (Activation) (None, 3, 3, 1024) 0 ['conv4_block9_add[0][0]']
conv4_block10_1_conv (Conv2D) (None, 3, 3, 256) 262400 ['conv4_block9_out[0][0]']
conv4_block10_1_bn (BatchNorma (None, 3, 3, 256) 1024 ['conv4_block10_1_conv[0][0]']
lization)
conv4_block10_1_relu (Activati (None, 3, 3, 256) 0 ['conv4_block10_1_bn[0][0]']
on)
conv4_block10_2_conv (Conv2D) (None, 3, 3, 256) 590080 ['conv4_block10_1_relu[0][0]']
conv4_block10_2_bn (BatchNorma (None, 3, 3, 256) 1024 ['conv4_block10_2_conv[0][0]']
lization)
conv4_block10_2_relu (Activati (None, 3, 3, 256) 0 ['conv4_block10_2_bn[0][0]']
on)
conv4_block10_3_conv (Conv2D) (None, 3, 3, 1024) 263168 ['conv4_block10_2_relu[0][0]']
conv4_block10_3_bn (BatchNorma (None, 3, 3, 1024) 4096 ['conv4_block10_3_conv[0][0]']
lization)
conv4_block10_add (Add) (None, 3, 3, 1024) 0 ['conv4_block9_out[0][0]',
'conv4_block10_3_bn[0][0]']
conv4_block10_out (Activation) (None, 3, 3, 1024) 0 ['conv4_block10_add[0][0]']
conv4_block11_1_conv (Conv2D) (None, 3, 3, 256) 262400 ['conv4_block10_out[0][0]']
conv4_block11_1_bn (BatchNorma (None, 3, 3, 256) 1024 ['conv4_block11_1_conv[0][0]']
lization)
conv4_block11_1_relu (Activati (None, 3, 3, 256) 0 ['conv4_block11_1_bn[0][0]']
on)
conv4_block11_2_conv (Conv2D) (None, 3, 3, 256) 590080 ['conv4_block11_1_relu[0][0]']
conv4_block11_2_bn (BatchNorma (None, 3, 3, 256) 1024 ['conv4_block11_2_conv[0][0]']
lization)
conv4_block11_2_relu (Activati (None, 3, 3, 256) 0 ['conv4_block11_2_bn[0][0]']
on)
conv4_block11_3_conv (Conv2D) (None, 3, 3, 1024) 263168 ['conv4_block11_2_relu[0][0]']
conv4_block11_3_bn (BatchNorma (None, 3, 3, 1024) 4096 ['conv4_block11_3_conv[0][0]']
lization)
conv4_block11_add (Add) (None, 3, 3, 1024) 0 ['conv4_block10_out[0][0]',
'conv4_block11_3_bn[0][0]']
conv4_block11_out (Activation) (None, 3, 3, 1024) 0 ['conv4_block11_add[0][0]']
conv4_block12_1_conv (Conv2D) (None, 3, 3, 256) 262400 ['conv4_block11_out[0][0]']
conv4_block12_1_bn (BatchNorma (None, 3, 3, 256) 1024 ['conv4_block12_1_conv[0][0]']
lization)
conv4_block12_1_relu (Activati (None, 3, 3, 256) 0 ['conv4_block12_1_bn[0][0]']
on)
conv4_block12_2_conv (Conv2D) (None, 3, 3, 256) 590080 ['conv4_block12_1_relu[0][0]']
conv4_block12_2_bn (BatchNorma (None, 3, 3, 256) 1024 ['conv4_block12_2_conv[0][0]']
lization)
conv4_block12_2_relu (Activati (None, 3, 3, 256) 0 ['conv4_block12_2_bn[0][0]']
on)
conv4_block12_3_conv (Conv2D) (None, 3, 3, 1024) 263168 ['conv4_block12_2_relu[0][0]']
conv4_block12_3_bn (BatchNorma (None, 3, 3, 1024) 4096 ['conv4_block12_3_conv[0][0]']
lization)
conv4_block12_add (Add) (None, 3, 3, 1024) 0 ['conv4_block11_out[0][0]',
'conv4_block12_3_bn[0][0]']
conv4_block12_out (Activation) (None, 3, 3, 1024) 0 ['conv4_block12_add[0][0]']
conv4_block13_1_conv (Conv2D) (None, 3, 3, 256) 262400 ['conv4_block12_out[0][0]']
conv4_block13_1_bn (BatchNorma (None, 3, 3, 256) 1024 ['conv4_block13_1_conv[0][0]']
lization)
conv4_block13_1_relu (Activati (None, 3, 3, 256) 0 ['conv4_block13_1_bn[0][0]']
on)
conv4_block13_2_conv (Conv2D) (None, 3, 3, 256) 590080 ['conv4_block13_1_relu[0][0]']
conv4_block13_2_bn (BatchNorma (None, 3, 3, 256) 1024 ['conv4_block13_2_conv[0][0]']
lization)
conv4_block13_2_relu (Activati (None, 3, 3, 256) 0 ['conv4_block13_2_bn[0][0]']
on)
conv4_block13_3_conv (Conv2D) (None, 3, 3, 1024) 263168 ['conv4_block13_2_relu[0][0]']
conv4_block13_3_bn (BatchNorma (None, 3, 3, 1024) 4096 ['conv4_block13_3_conv[0][0]']
lization)
conv4_block13_add (Add) (None, 3, 3, 1024) 0 ['conv4_block12_out[0][0]',
'conv4_block13_3_bn[0][0]']
conv4_block13_out (Activation) (None, 3, 3, 1024) 0 ['conv4_block13_add[0][0]']
conv4_block14_1_conv (Conv2D) (None, 3, 3, 256) 262400 ['conv4_block13_out[0][0]']
conv4_block14_1_bn (BatchNorma (None, 3, 3, 256) 1024 ['conv4_block14_1_conv[0][0]']
lization)
conv4_block14_1_relu (Activati (None, 3, 3, 256) 0 ['conv4_block14_1_bn[0][0]']
on)
conv4_block14_2_conv (Conv2D) (None, 3, 3, 256) 590080 ['conv4_block14_1_relu[0][0]']
conv4_block14_2_bn (BatchNorma (None, 3, 3, 256) 1024 ['conv4_block14_2_conv[0][0]']
lization)
conv4_block14_2_relu (Activati (None, 3, 3, 256) 0 ['conv4_block14_2_bn[0][0]']
on)
conv4_block14_3_conv (Conv2D) (None, 3, 3, 1024) 263168 ['conv4_block14_2_relu[0][0]']
conv4_block14_3_bn (BatchNorma (None, 3, 3, 1024) 4096 ['conv4_block14_3_conv[0][0]']
lization)
conv4_block14_add (Add) (None, 3, 3, 1024) 0 ['conv4_block13_out[0][0]',
'conv4_block14_3_bn[0][0]']
conv4_block14_out (Activation) (None, 3, 3, 1024) 0 ['conv4_block14_add[0][0]']
conv4_block15_1_conv (Conv2D) (None, 3, 3, 256) 262400 ['conv4_block14_out[0][0]']
conv4_block15_1_bn (BatchNorma (None, 3, 3, 256) 1024 ['conv4_block15_1_conv[0][0]']
lization)
conv4_block15_1_relu (Activati (None, 3, 3, 256) 0 ['conv4_block15_1_bn[0][0]']
on)
conv4_block15_2_conv (Conv2D) (None, 3, 3, 256) 590080 ['conv4_block15_1_relu[0][0]']
conv4_block15_2_bn (BatchNorma (None, 3, 3, 256) 1024 ['conv4_block15_2_conv[0][0]']
lization)
conv4_block15_2_relu (Activati (None, 3, 3, 256) 0 ['conv4_block15_2_bn[0][0]']
on)
conv4_block15_3_conv (Conv2D) (None, 3, 3, 1024) 263168 ['conv4_block15_2_relu[0][0]']
conv4_block15_3_bn (BatchNorma (None, 3, 3, 1024) 4096 ['conv4_block15_3_conv[0][0]']
lization)
conv4_block15_add (Add) (None, 3, 3, 1024) 0 ['conv4_block14_out[0][0]',
'conv4_block15_3_bn[0][0]']
conv4_block15_out (Activation) (None, 3, 3, 1024) 0 ['conv4_block15_add[0][0]']
conv4_block16_1_conv (Conv2D) (None, 3, 3, 256) 262400 ['conv4_block15_out[0][0]']
conv4_block16_1_bn (BatchNorma (None, 3, 3, 256) 1024 ['conv4_block16_1_conv[0][0]']
lization)
conv4_block16_1_relu (Activati (None, 3, 3, 256) 0 ['conv4_block16_1_bn[0][0]']
on)
conv4_block16_2_conv (Conv2D) (None, 3, 3, 256) 590080 ['conv4_block16_1_relu[0][0]']
conv4_block16_2_bn (BatchNorma (None, 3, 3, 256) 1024 ['conv4_block16_2_conv[0][0]']
lization)
conv4_block16_2_relu (Activati (None, 3, 3, 256) 0 ['conv4_block16_2_bn[0][0]']
on)
conv4_block16_3_conv (Conv2D) (None, 3, 3, 1024) 263168 ['conv4_block16_2_relu[0][0]']
conv4_block16_3_bn (BatchNorma (None, 3, 3, 1024) 4096 ['conv4_block16_3_conv[0][0]']
lization)
conv4_block16_add (Add) (None, 3, 3, 1024) 0 ['conv4_block15_out[0][0]',
'conv4_block16_3_bn[0][0]']
conv4_block16_out (Activation) (None, 3, 3, 1024) 0 ['conv4_block16_add[0][0]']
conv4_block17_1_conv (Conv2D) (None, 3, 3, 256) 262400 ['conv4_block16_out[0][0]']
conv4_block17_1_bn (BatchNorma (None, 3, 3, 256) 1024 ['conv4_block17_1_conv[0][0]']
lization)
conv4_block17_1_relu (Activati (None, 3, 3, 256) 0 ['conv4_block17_1_bn[0][0]']
on)
conv4_block17_2_conv (Conv2D) (None, 3, 3, 256) 590080 ['conv4_block17_1_relu[0][0]']
conv4_block17_2_bn (BatchNorma (None, 3, 3, 256) 1024 ['conv4_block17_2_conv[0][0]']
lization)
conv4_block17_2_relu (Activati (None, 3, 3, 256) 0 ['conv4_block17_2_bn[0][0]']
on)
conv4_block17_3_conv (Conv2D) (None, 3, 3, 1024) 263168 ['conv4_block17_2_relu[0][0]']
conv4_block17_3_bn (BatchNorma (None, 3, 3, 1024) 4096 ['conv4_block17_3_conv[0][0]']
lization)
conv4_block17_add (Add) (None, 3, 3, 1024) 0 ['conv4_block16_out[0][0]',
'conv4_block17_3_bn[0][0]']
conv4_block17_out (Activation) (None, 3, 3, 1024) 0 ['conv4_block17_add[0][0]']
conv4_block18_1_conv (Conv2D) (None, 3, 3, 256) 262400 ['conv4_block17_out[0][0]']
conv4_block18_1_bn (BatchNorma (None, 3, 3, 256) 1024 ['conv4_block18_1_conv[0][0]']
lization)
conv4_block18_1_relu (Activati (None, 3, 3, 256) 0 ['conv4_block18_1_bn[0][0]']
on)
conv4_block18_2_conv (Conv2D) (None, 3, 3, 256) 590080 ['conv4_block18_1_relu[0][0]']
conv4_block18_2_bn (BatchNorma (None, 3, 3, 256) 1024 ['conv4_block18_2_conv[0][0]']
lization)
conv4_block18_2_relu (Activati (None, 3, 3, 256) 0 ['conv4_block18_2_bn[0][0]']
on)
conv4_block18_3_conv (Conv2D) (None, 3, 3, 1024) 263168 ['conv4_block18_2_relu[0][0]']
conv4_block18_3_bn (BatchNorma (None, 3, 3, 1024) 4096 ['conv4_block18_3_conv[0][0]']
lization)
conv4_block18_add (Add) (None, 3, 3, 1024) 0 ['conv4_block17_out[0][0]',
'conv4_block18_3_bn[0][0]']
conv4_block18_out (Activation) (None, 3, 3, 1024) 0 ['conv4_block18_add[0][0]']
conv4_block19_1_conv (Conv2D) (None, 3, 3, 256) 262400 ['conv4_block18_out[0][0]']
conv4_block19_1_bn (BatchNorma (None, 3, 3, 256) 1024 ['conv4_block19_1_conv[0][0]']
lization)
conv4_block19_1_relu (Activati (None, 3, 3, 256) 0 ['conv4_block19_1_bn[0][0]']
on)
conv4_block19_2_conv (Conv2D) (None, 3, 3, 256) 590080 ['conv4_block19_1_relu[0][0]']
conv4_block19_2_bn (BatchNorma (None, 3, 3, 256) 1024 ['conv4_block19_2_conv[0][0]']
lization)
conv4_block19_2_relu (Activati (None, 3, 3, 256) 0 ['conv4_block19_2_bn[0][0]']
on)
conv4_block19_3_conv (Conv2D) (None, 3, 3, 1024) 263168 ['conv4_block19_2_relu[0][0]']
conv4_block19_3_bn (BatchNorma (None, 3, 3, 1024) 4096 ['conv4_block19_3_conv[0][0]']
lization)
conv4_block19_add (Add) (None, 3, 3, 1024) 0 ['conv4_block18_out[0][0]',
'conv4_block19_3_bn[0][0]']
conv4_block19_out (Activation) (None, 3, 3, 1024) 0 ['conv4_block19_add[0][0]']
conv4_block20_1_conv (Conv2D) (None, 3, 3, 256) 262400 ['conv4_block19_out[0][0]']
conv4_block20_1_bn (BatchNorma (None, 3, 3, 256) 1024 ['conv4_block20_1_conv[0][0]']
lization)
conv4_block20_1_relu (Activati (None, 3, 3, 256) 0 ['conv4_block20_1_bn[0][0]']
on)
conv4_block20_2_conv (Conv2D) (None, 3, 3, 256) 590080 ['conv4_block20_1_relu[0][0]']
conv4_block20_2_bn (BatchNorma (None, 3, 3, 256) 1024 ['conv4_block20_2_conv[0][0]']
lization)
conv4_block20_2_relu (Activati (None, 3, 3, 256) 0 ['conv4_block20_2_bn[0][0]']
on)
conv4_block20_3_conv (Conv2D) (None, 3, 3, 1024) 263168 ['conv4_block20_2_relu[0][0]']
conv4_block20_3_bn (BatchNorma (None, 3, 3, 1024) 4096 ['conv4_block20_3_conv[0][0]']
lization)
conv4_block20_add (Add) (None, 3, 3, 1024) 0 ['conv4_block19_out[0][0]',
'conv4_block20_3_bn[0][0]']
conv4_block20_out (Activation) (None, 3, 3, 1024) 0 ['conv4_block20_add[0][0]']
conv4_block21_1_conv (Conv2D) (None, 3, 3, 256) 262400 ['conv4_block20_out[0][0]']
conv4_block21_1_bn (BatchNorma (None, 3, 3, 256) 1024 ['conv4_block21_1_conv[0][0]']
lization)
conv4_block21_1_relu (Activati (None, 3, 3, 256) 0 ['conv4_block21_1_bn[0][0]']
on)
conv4_block21_2_conv (Conv2D) (None, 3, 3, 256) 590080 ['conv4_block21_1_relu[0][0]']
conv4_block21_2_bn (BatchNorma (None, 3, 3, 256) 1024 ['conv4_block21_2_conv[0][0]']
lization)
conv4_block21_2_relu (Activati (None, 3, 3, 256) 0 ['conv4_block21_2_bn[0][0]']
on)
conv4_block21_3_conv (Conv2D) (None, 3, 3, 1024) 263168 ['conv4_block21_2_relu[0][0]']
conv4_block21_3_bn (BatchNorma (None, 3, 3, 1024) 4096 ['conv4_block21_3_conv[0][0]']
lization)
conv4_block21_add (Add) (None, 3, 3, 1024) 0 ['conv4_block20_out[0][0]',
'conv4_block21_3_bn[0][0]']
conv4_block21_out (Activation) (None, 3, 3, 1024) 0 ['conv4_block21_add[0][0]']
conv4_block22_1_conv (Conv2D) (None, 3, 3, 256) 262400 ['conv4_block21_out[0][0]']
conv4_block22_1_bn (BatchNorma (None, 3, 3, 256) 1024 ['conv4_block22_1_conv[0][0]']
lization)
conv4_block22_1_relu (Activati (None, 3, 3, 256) 0 ['conv4_block22_1_bn[0][0]']
on)
conv4_block22_2_conv (Conv2D) (None, 3, 3, 256) 590080 ['conv4_block22_1_relu[0][0]']
conv4_block22_2_bn (BatchNorma (None, 3, 3, 256) 1024 ['conv4_block22_2_conv[0][0]']
lization)
conv4_block22_2_relu (Activati (None, 3, 3, 256) 0 ['conv4_block22_2_bn[0][0]']
on)
conv4_block22_3_conv (Conv2D) (None, 3, 3, 1024) 263168 ['conv4_block22_2_relu[0][0]']
conv4_block22_3_bn (BatchNorma (None, 3, 3, 1024) 4096 ['conv4_block22_3_conv[0][0]']
lization)
conv4_block22_add (Add) (None, 3, 3, 1024) 0 ['conv4_block21_out[0][0]',
'conv4_block22_3_bn[0][0]']
conv4_block22_out (Activation) (None, 3, 3, 1024) 0 ['conv4_block22_add[0][0]']
conv4_block23_1_conv (Conv2D) (None, 3, 3, 256) 262400 ['conv4_block22_out[0][0]']
conv4_block23_1_bn (BatchNorma (None, 3, 3, 256) 1024 ['conv4_block23_1_conv[0][0]']
lization)
conv4_block23_1_relu (Activati (None, 3, 3, 256) 0 ['conv4_block23_1_bn[0][0]']
on)
conv4_block23_2_conv (Conv2D) (None, 3, 3, 256) 590080 ['conv4_block23_1_relu[0][0]']
conv4_block23_2_bn (BatchNorma (None, 3, 3, 256) 1024 ['conv4_block23_2_conv[0][0]']
lization)
conv4_block23_2_relu (Activati (None, 3, 3, 256) 0 ['conv4_block23_2_bn[0][0]']
on)
conv4_block23_3_conv (Conv2D) (None, 3, 3, 1024) 263168 ['conv4_block23_2_relu[0][0]']
conv4_block23_3_bn (BatchNorma (None, 3, 3, 1024) 4096 ['conv4_block23_3_conv[0][0]']
lization)
conv4_block23_add (Add) (None, 3, 3, 1024) 0 ['conv4_block22_out[0][0]',
'conv4_block23_3_bn[0][0]']
conv4_block23_out (Activation) (None, 3, 3, 1024) 0 ['conv4_block23_add[0][0]']
conv5_block1_1_conv (Conv2D) (None, 2, 2, 512) 524800 ['conv4_block23_out[0][0]']
conv5_block1_1_bn (BatchNormal (None, 2, 2, 512) 2048 ['conv5_block1_1_conv[0][0]']
ization)
conv5_block1_1_relu (Activatio (None, 2, 2, 512) 0 ['conv5_block1_1_bn[0][0]']
n)
conv5_block1_2_conv (Conv2D) (None, 2, 2, 512) 2359808 ['conv5_block1_1_relu[0][0]']
conv5_block1_2_bn (BatchNormal (None, 2, 2, 512) 2048 ['conv5_block1_2_conv[0][0]']
ization)
conv5_block1_2_relu (Activatio (None, 2, 2, 512) 0 ['conv5_block1_2_bn[0][0]']
n)
conv5_block1_0_conv (Conv2D) (None, 2, 2, 2048) 2099200 ['conv4_block23_out[0][0]']
conv5_block1_3_conv (Conv2D) (None, 2, 2, 2048) 1050624 ['conv5_block1_2_relu[0][0]']
conv5_block1_0_bn (BatchNormal (None, 2, 2, 2048) 8192 ['conv5_block1_0_conv[0][0]']
ization)
conv5_block1_3_bn (BatchNormal (None, 2, 2, 2048) 8192 ['conv5_block1_3_conv[0][0]']
ization)
conv5_block1_add (Add) (None, 2, 2, 2048) 0 ['conv5_block1_0_bn[0][0]',
'conv5_block1_3_bn[0][0]']
conv5_block1_out (Activation) (None, 2, 2, 2048) 0 ['conv5_block1_add[0][0]']
conv5_block2_1_conv (Conv2D) (None, 2, 2, 512) 1049088 ['conv5_block1_out[0][0]']
conv5_block2_1_bn (BatchNormal (None, 2, 2, 512) 2048 ['conv5_block2_1_conv[0][0]']
ization)
conv5_block2_1_relu (Activatio (None, 2, 2, 512) 0 ['conv5_block2_1_bn[0][0]']
n)
conv5_block2_2_conv (Conv2D) (None, 2, 2, 512) 2359808 ['conv5_block2_1_relu[0][0]']
conv5_block2_2_bn (BatchNormal (None, 2, 2, 512) 2048 ['conv5_block2_2_conv[0][0]']
ization)
conv5_block2_2_relu (Activatio (None, 2, 2, 512) 0 ['conv5_block2_2_bn[0][0]']
n)
conv5_block2_3_conv (Conv2D) (None, 2, 2, 2048) 1050624 ['conv5_block2_2_relu[0][0]']
conv5_block2_3_bn (BatchNormal (None, 2, 2, 2048) 8192 ['conv5_block2_3_conv[0][0]']
ization)
conv5_block2_add (Add) (None, 2, 2, 2048) 0 ['conv5_block1_out[0][0]',
'conv5_block2_3_bn[0][0]']
conv5_block2_out (Activation) (None, 2, 2, 2048) 0 ['conv5_block2_add[0][0]']
conv5_block3_1_conv (Conv2D) (None, 2, 2, 512) 1049088 ['conv5_block2_out[0][0]']
conv5_block3_1_bn (BatchNormal (None, 2, 2, 512) 2048 ['conv5_block3_1_conv[0][0]']
ization)
conv5_block3_1_relu (Activatio (None, 2, 2, 512) 0 ['conv5_block3_1_bn[0][0]']
n)
conv5_block3_2_conv (Conv2D) (None, 2, 2, 512) 2359808 ['conv5_block3_1_relu[0][0]']
conv5_block3_2_bn (BatchNormal (None, 2, 2, 512) 2048 ['conv5_block3_2_conv[0][0]']
ization)
conv5_block3_2_relu (Activatio (None, 2, 2, 512) 0 ['conv5_block3_2_bn[0][0]']
n)
conv5_block3_3_conv (Conv2D) (None, 2, 2, 2048) 1050624 ['conv5_block3_2_relu[0][0]']
conv5_block3_3_bn (BatchNormal (None, 2, 2, 2048) 8192 ['conv5_block3_3_conv[0][0]']
ization)
conv5_block3_add (Add) (None, 2, 2, 2048) 0 ['conv5_block2_out[0][0]',
'conv5_block3_3_bn[0][0]']
conv5_block3_out (Activation) (None, 2, 2, 2048) 0 ['conv5_block3_add[0][0]']
==================================================================================================
Total params: 42,658,176
Trainable params: 42,552,832
Non-trainable params: 105,344
__________________________________________________________________________________________________
We have imported the ResNet v2 model up to layer 'conv4_block23_add', as this cutoff showed the best performance compared to other candidate layers (discussed below). The ResNet v2 layers will be frozen, so the only trainable layers will be those we add ourselves. After flattening the output of 'conv4_block23_add', we will add the same head architecture we used earlier with VGG16, namely two dense layers, followed by a Dropout layer, another dense layer, and BatchNormalization. We will once again end with a softmax classifier, as this is a multi-class classification exercise.
transfer_layer = Resnet.get_layer('conv4_block23_add')
Resnet.trainable = False
# Flatten the input
x = Flatten()(transfer_layer.output)
# Dense layers
x = Dense(256, activation='relu')(x)
x = Dense(128, activation='relu')(x)
x = Dropout(0.2)(x)
x = Dense(64, activation='relu')(x)
x = BatchNormalization()(x)
# Classifier
pred = Dense(4, activation='softmax')(x)
# Initialize the model
model_4 = Model(Resnet.input, pred)
# Creating a checkpoint which saves model weights from the best epoch
checkpoint = ModelCheckpoint('./model_4.h5', monitor='val_accuracy', verbose=1, save_best_only=True, mode='auto')
# Initiates early stopping if validation loss does not continue to improve
early_stopping = EarlyStopping(monitor = 'val_loss',
min_delta = 0,
patience = 15, # Increased over initial models otherwise training is cut off too quickly
verbose = 1,
restore_best_weights = True)
# Initiates reduced learning rate if validation loss does not continue to improve
reduce_learningrate = ReduceLROnPlateau(monitor = 'val_loss',
factor = 0.2,
patience = 3,
verbose = 1,
min_delta = 0.0001)
callbacks_list = [checkpoint, early_stopping, reduce_learningrate]
# Compiling model with optimizer set to Adam, loss set to categorical_crossentropy, and metrics set to accuracy
model_4.compile(optimizer = Adam(learning_rate = 0.001), loss = 'categorical_crossentropy', metrics = ['accuracy'])
# Fitting model with epochs set to 100
history_4 = model_4.fit(train_set_rgb, validation_data = val_set_rgb, epochs = 100, callbacks = callbacks_list)
Epoch 1/100  - 41s 80ms/step - loss: 1.4286 - accuracy: 0.2619 - val_loss: 1.3548 - val_accuracy: 0.3629 - lr: 1.0e-03  (val_accuracy improved from -inf to 0.36287; saving model to ./model_4.h5)
Epoch 2/100  - 32s 66ms/step - loss: 1.4022 - accuracy: 0.2626 - val_loss: 1.4131 - val_accuracy: 0.2443 - lr: 1.0e-03
Epoch 3/100  - 32s 67ms/step - loss: 1.3988 - accuracy: 0.2634 - val_loss: 1.4782 - val_accuracy: 0.2289 - lr: 1.0e-03
Epoch 4/100  - 32s 67ms/step - loss: 1.3911 - accuracy: 0.2681 - val_loss: 1.4236 - val_accuracy: 0.2443 - lr: 1.0e-03  (ReduceLROnPlateau: lr -> 2.0e-04)
Epoch 5/100  - 33s 70ms/step - loss: 1.3799 - accuracy: 0.2763 - val_loss: 1.4006 - val_accuracy: 0.2329 - lr: 2.0e-04
Epoch 6/100  - 32s 68ms/step - loss: 1.3758 - accuracy: 0.2825 - val_loss: 1.4284 - val_accuracy: 0.2524 - lr: 2.0e-04
Epoch 7/100  - 32s 68ms/step - loss: 1.3730 - accuracy: 0.2878 - val_loss: 1.3915 - val_accuracy: 0.2443 - lr: 2.0e-04  (ReduceLROnPlateau: lr -> 4.0e-05)
Epoch 8/100  - 33s 70ms/step - loss: 1.3691 - accuracy: 0.2905 - val_loss: 1.4117 - val_accuracy: 0.2560 - lr: 4.0e-05
Epoch 9/100  - 32s 68ms/step - loss: 1.3658 - accuracy: 0.3011 - val_loss: 1.4033 - val_accuracy: 0.2556 - lr: 4.0e-05
Epoch 10/100 - 32s 67ms/step - loss: 1.3666 - accuracy: 0.2944 - val_loss: 1.4060 - val_accuracy: 0.2431 - lr: 4.0e-05  (ReduceLROnPlateau: lr -> 8.0e-06)
Epoch 11/100 - 33s 71ms/step - loss: 1.3639 - accuracy: 0.3004 - val_loss: 1.3998 - val_accuracy: 0.2558 - lr: 8.0e-06
Epoch 12/100 - 33s 69ms/step - loss: 1.3632 - accuracy: 0.2994 - val_loss: 1.3965 - val_accuracy: 0.2574 - lr: 8.0e-06
Epoch 13/100 - 32s 67ms/step - loss: 1.3662 - accuracy: 0.3015 - val_loss: 1.3940 - val_accuracy: 0.2592 - lr: 8.0e-06  (ReduceLROnPlateau: lr -> 1.6e-06)
Epoch 14/100 - 32s 67ms/step - loss: 1.3655 - accuracy: 0.2984 - val_loss: 1.3998 - val_accuracy: 0.2542 - lr: 1.6e-06
Epoch 15/100 - 33s 71ms/step - loss: 1.3623 - accuracy: 0.3027 - val_loss: 1.3961 - val_accuracy: 0.2495 - lr: 1.6e-06
Epoch 16/100 - 32s 68ms/step - loss: 1.3630 - accuracy: 0.2998 - val_loss: 1.3992 - val_accuracy: 0.2550 - lr: 1.6e-06  (ReduceLROnPlateau: lr -> 3.2e-07; restoring model weights from best epoch: 1; early stopping)
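The learning-rate trajectory visible in the log follows directly from the ReduceLROnPlateau settings (factor=0.2, patience=3): each time validation loss fails to improve for three epochs, the current rate is multiplied by 0.2. A minimal sketch of that arithmetic:

```python
# Reproduce the learning-rate decay seen in the training log above:
# each plateau multiplies the current rate by `factor` (0.2 here).
initial_lr = 1e-3
factor = 0.2

lrs = [initial_lr]
for _ in range(4):  # four reductions occurred (at epochs 4, 7, 10, and 13)
    lrs.append(lrs[-1] * factor)

for lr in lrs:
    print(f"{lr:.1e}")
# 1.0e-03, 2.0e-04, 4.0e-05, 8.0e-06, 1.6e-06 -- matching the log's lr values
```

This geometric decay is why the later epochs barely move: by epoch 14 the effective learning rate is three orders of magnitude below the starting value.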
# Plotting the accuracies
plt.figure(figsize = (10, 5))
plt.plot(history_4.history['accuracy'])
plt.plot(history_4.history['val_accuracy'])
plt.title('Accuracy - ResNet V2 Model')
plt.ylabel('Accuracy')
plt.xlabel('Epoch')
plt.legend(['Training', 'Validation'], loc='lower right')
plt.show()
# Plotting the losses
plt.figure(figsize = (10, 5))
plt.plot(history_4.history['loss'])
plt.plot(history_4.history['val_loss'])
plt.title('Loss - ResNet V2 Model')
plt.ylabel('loss')
plt.xlabel('epoch')
plt.legend(['Training', 'Validation'], loc='upper right')
plt.show()
# Evaluating the model's performance on the test set (evaluate returns [loss, accuracy])
test_loss, test_accuracy = model_4.evaluate(test_set_rgb)
4/4 [==============================] - 0s 46ms/step - loss: 1.4027 - accuracy: 0.2812
Observations and Insights:
As imported and modified, our transfer learning model performs poorly. After just one epoch (the 'best' epoch!), training accuracy stands at 0.26 and validation accuracy at 0.36. Accuracy and loss for both training and validation data level off quickly, at which point early stopping aborts the training. The accuracy and loss curves above paint the picture of a poor model that will not generalize well at all. The model's test accuracy comes in at 0.28.
The ResNet v2 model was ultimately imported up to layer 'conv4_block23_add', as it produced the 'best' performance, though the margins between candidates were slim. A history of the alternative cut-off layers tried is below.
| Model | Train Loss | Train Accuracy | Val Loss | Val Accuracy |
|---|---|---|---|---|
| ResNet V2 conv4_block23_add (selected) | 1.43 | 0.26 | 1.35 | 0.36 |
| ResNet V2 conv5_block3_add | 1.47 | 0.23 | 1.43 | 0.33 |
| ResNet V2 conv3_block4_add | 1.49 | 0.22 | 1.44 | 0.33 |
| ResNet V2 conv2_block3_add | 1.51 | 0.21 | 1.55 | 0.21 |
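The selection among cut-off layers amounts to picking the candidate with the highest validation accuracy, breaking near-ties by validation loss. A small sketch of that comparison, using the metrics from the table above:

```python
# Validation metrics logged for each candidate ResNet V2 cut-off layer
candidates = {
    "conv4_block23_add": {"val_loss": 1.35, "val_acc": 0.36},
    "conv5_block3_add":  {"val_loss": 1.43, "val_acc": 0.33},
    "conv3_block4_add":  {"val_loss": 1.44, "val_acc": 0.33},
    "conv2_block3_add":  {"val_loss": 1.55, "val_acc": 0.21},
}

# Rank by validation accuracy (descending), then validation loss (ascending)
best = max(candidates, key=lambda k: (candidates[k]["val_acc"],
                                      -candidates[k]["val_loss"]))
print(best)  # conv4_block23_add
```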
Our third transfer learning model is EfficientNet, a CNN that uses 'compound scaling' (jointly scaling network depth, width, and input resolution) to improve efficiency and, theoretically at least, performance. As with VGG16 and ResNet v2, color_mode must be set to 'rgb' to leverage this pre-trained architecture, since the ImageNet weights expect three-channel input.
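Because the underlying images are grayscale, the three-channel requirement is met by replicating the single channel three times (which is effectively what Keras does when grayscale files are read with color_mode='rgb'). A NumPy sketch of that conversion, using a hypothetical batch of the 48x48 images from this project:

```python
import numpy as np

# Hypothetical batch of 32 grayscale 48x48 images, values in [0, 1]
gray = np.random.rand(32, 48, 48, 1).astype("float32")

# Replicate the single channel along the last axis to get RGB shape
rgb = np.repeat(gray, 3, axis=-1)

print(rgb.shape)  # (32, 48, 48, 3)
# All three channels are identical copies of the grayscale channel
assert np.allclose(rgb[..., 0], rgb[..., 1])
```

No new information is added by the replication; it only makes the input shape compatible with the pre-trained weights.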
EfficientNet = ap.EfficientNetV2B2(include_top=False, weights="imagenet", input_shape= (48, 48, 3))
EfficientNet.summary()
Model: "efficientnetv2-b2"
__________________________________________________________________________________________________
Layer (type) Output Shape Param # Connected to
==================================================================================================
input_1 (InputLayer) [(None, 48, 48, 3)] 0 []
rescaling (Rescaling) (None, 48, 48, 3) 0 ['input_1[0][0]']
normalization (Normalization) (None, 48, 48, 3) 0 ['rescaling[0][0]']
stem_conv (Conv2D) (None, 24, 24, 32) 864 ['normalization[0][0]']
stem_bn (BatchNormalization) (None, 24, 24, 32) 128 ['stem_conv[0][0]']
stem_activation (Activation) (None, 24, 24, 32) 0 ['stem_bn[0][0]']
block1a_project_conv (Conv2D) (None, 24, 24, 16) 4608 ['stem_activation[0][0]']
block1a_project_bn (BatchNorma (None, 24, 24, 16) 64 ['block1a_project_conv[0][0]']
lization)
block1a_project_activation (Ac (None, 24, 24, 16) 0 ['block1a_project_bn[0][0]']
tivation)
block1b_project_conv (Conv2D) (None, 24, 24, 16) 2304 ['block1a_project_activation[0][0
]']
block1b_project_bn (BatchNorma (None, 24, 24, 16) 64 ['block1b_project_conv[0][0]']
lization)
block1b_project_activation (Ac (None, 24, 24, 16) 0 ['block1b_project_bn[0][0]']
tivation)
block1b_drop (Dropout) (None, 24, 24, 16) 0 ['block1b_project_activation[0][0
]']
block1b_add (Add) (None, 24, 24, 16) 0 ['block1b_drop[0][0]',
'block1a_project_activation[0][0
]']
block2a_expand_conv (Conv2D) (None, 12, 12, 64) 9216 ['block1b_add[0][0]']
block2a_expand_bn (BatchNormal (None, 12, 12, 64) 256 ['block2a_expand_conv[0][0]']
ization)
block2a_expand_activation (Act (None, 12, 12, 64) 0 ['block2a_expand_bn[0][0]']
ivation)
block2a_project_conv (Conv2D) (None, 12, 12, 32) 2048 ['block2a_expand_activation[0][0]
']
block2a_project_bn (BatchNorma (None, 12, 12, 32) 128 ['block2a_project_conv[0][0]']
lization)
block2b_expand_conv (Conv2D) (None, 12, 12, 128) 36864 ['block2a_project_bn[0][0]']
block2b_expand_bn (BatchNormal (None, 12, 12, 128) 512 ['block2b_expand_conv[0][0]']
ization)
block2b_expand_activation (Act (None, 12, 12, 128) 0 ['block2b_expand_bn[0][0]']
ivation)
block2b_project_conv (Conv2D) (None, 12, 12, 32) 4096 ['block2b_expand_activation[0][0]
']
block2b_project_bn (BatchNorma (None, 12, 12, 32) 128 ['block2b_project_conv[0][0]']
lization)
block2b_drop (Dropout) (None, 12, 12, 32) 0 ['block2b_project_bn[0][0]']
block2b_add (Add) (None, 12, 12, 32) 0 ['block2b_drop[0][0]',
'block2a_project_bn[0][0]']
block2c_expand_conv (Conv2D) (None, 12, 12, 128) 36864 ['block2b_add[0][0]']
block2c_expand_bn (BatchNormal (None, 12, 12, 128) 512 ['block2c_expand_conv[0][0]']
ization)
block2c_expand_activation (Act (None, 12, 12, 128) 0 ['block2c_expand_bn[0][0]']
ivation)
block2c_project_conv (Conv2D) (None, 12, 12, 32) 4096 ['block2c_expand_activation[0][0]
']
block2c_project_bn (BatchNorma (None, 12, 12, 32) 128 ['block2c_project_conv[0][0]']
lization)
block2c_drop (Dropout) (None, 12, 12, 32) 0 ['block2c_project_bn[0][0]']
block2c_add (Add) (None, 12, 12, 32) 0 ['block2c_drop[0][0]',
'block2b_add[0][0]']
block3a_expand_conv (Conv2D) (None, 6, 6, 128) 36864 ['block2c_add[0][0]']
block3a_expand_bn (BatchNormal (None, 6, 6, 128) 512 ['block3a_expand_conv[0][0]']
ization)
block3a_expand_activation (Act (None, 6, 6, 128) 0 ['block3a_expand_bn[0][0]']
ivation)
block3a_project_conv (Conv2D) (None, 6, 6, 56) 7168 ['block3a_expand_activation[0][0]
']
block3a_project_bn (BatchNorma (None, 6, 6, 56) 224 ['block3a_project_conv[0][0]']
lization)
block3b_expand_conv (Conv2D) (None, 6, 6, 224) 112896 ['block3a_project_bn[0][0]']
block3b_expand_bn (BatchNormal (None, 6, 6, 224) 896 ['block3b_expand_conv[0][0]']
ization)
block3b_expand_activation (Act (None, 6, 6, 224) 0 ['block3b_expand_bn[0][0]']
ivation)
block3b_project_conv (Conv2D) (None, 6, 6, 56) 12544 ['block3b_expand_activation[0][0]
']
block3b_project_bn (BatchNorma (None, 6, 6, 56) 224 ['block3b_project_conv[0][0]']
lization)
block3b_drop (Dropout) (None, 6, 6, 56) 0 ['block3b_project_bn[0][0]']
block3b_add (Add) (None, 6, 6, 56) 0 ['block3b_drop[0][0]',
'block3a_project_bn[0][0]']
block3c_expand_conv (Conv2D) (None, 6, 6, 224) 112896 ['block3b_add[0][0]']
block3c_expand_bn (BatchNormal (None, 6, 6, 224) 896 ['block3c_expand_conv[0][0]']
ization)
block3c_expand_activation (Act (None, 6, 6, 224) 0 ['block3c_expand_bn[0][0]']
ivation)
block3c_project_conv (Conv2D) (None, 6, 6, 56) 12544 ['block3c_expand_activation[0][0]
']
block3c_project_bn (BatchNorma (None, 6, 6, 56) 224 ['block3c_project_conv[0][0]']
lization)
block3c_drop (Dropout) (None, 6, 6, 56) 0 ['block3c_project_bn[0][0]']
block3c_add (Add) (None, 6, 6, 56) 0 ['block3c_drop[0][0]',
'block3b_add[0][0]']
block4a_expand_conv (Conv2D) (None, 6, 6, 224) 12544 ['block3c_add[0][0]']
block4a_expand_bn (BatchNormal (None, 6, 6, 224) 896 ['block4a_expand_conv[0][0]']
ization)
block4a_expand_activation (Act (None, 6, 6, 224) 0 ['block4a_expand_bn[0][0]']
ivation)
block4a_dwconv2 (DepthwiseConv (None, 3, 3, 224) 2016 ['block4a_expand_activation[0][0]
2D) ']
block4a_bn (BatchNormalization (None, 3, 3, 224) 896 ['block4a_dwconv2[0][0]']
)
block4a_activation (Activation (None, 3, 3, 224) 0 ['block4a_bn[0][0]']
)
block4a_se_squeeze (GlobalAver (None, 224) 0 ['block4a_activation[0][0]']
agePooling2D)
block4a_se_reshape (Reshape) (None, 1, 1, 224) 0 ['block4a_se_squeeze[0][0]']
block4a_se_reduce (Conv2D) (None, 1, 1, 14) 3150 ['block4a_se_reshape[0][0]']
block4a_se_expand (Conv2D) (None, 1, 1, 224) 3360 ['block4a_se_reduce[0][0]']
block4a_se_excite (Multiply) (None, 3, 3, 224) 0 ['block4a_activation[0][0]',
'block4a_se_expand[0][0]']
block4a_project_conv (Conv2D) (None, 3, 3, 104) 23296 ['block4a_se_excite[0][0]']
block4a_project_bn (BatchNorma (None, 3, 3, 104) 416 ['block4a_project_conv[0][0]']
lization)
block4b_expand_conv (Conv2D) (None, 3, 3, 416) 43264 ['block4a_project_bn[0][0]']
block4b_expand_bn (BatchNormal (None, 3, 3, 416) 1664 ['block4b_expand_conv[0][0]']
ization)
block4b_expand_activation (Act (None, 3, 3, 416) 0 ['block4b_expand_bn[0][0]']
ivation)
block4b_dwconv2 (DepthwiseConv (None, 3, 3, 416) 3744 ['block4b_expand_activation[0][0]
2D) ']
block4b_bn (BatchNormalization (None, 3, 3, 416) 1664 ['block4b_dwconv2[0][0]']
)
block4b_activation (Activation (None, 3, 3, 416) 0 ['block4b_bn[0][0]']
)
block4b_se_squeeze (GlobalAver (None, 416) 0 ['block4b_activation[0][0]']
agePooling2D)
block4b_se_reshape (Reshape) (None, 1, 1, 416) 0 ['block4b_se_squeeze[0][0]']
block4b_se_reduce (Conv2D) (None, 1, 1, 26) 10842 ['block4b_se_reshape[0][0]']
block4b_se_expand (Conv2D) (None, 1, 1, 416) 11232 ['block4b_se_reduce[0][0]']
block4b_se_excite (Multiply) (None, 3, 3, 416) 0 ['block4b_activation[0][0]',
'block4b_se_expand[0][0]']
block4b_project_conv (Conv2D) (None, 3, 3, 104) 43264 ['block4b_se_excite[0][0]']
block4b_project_bn (BatchNorma (None, 3, 3, 104) 416 ['block4b_project_conv[0][0]']
lization)
block4b_drop (Dropout) (None, 3, 3, 104) 0 ['block4b_project_bn[0][0]']
block4b_add (Add) (None, 3, 3, 104) 0 ['block4b_drop[0][0]',
'block4a_project_bn[0][0]']
block4c_expand_conv (Conv2D) (None, 3, 3, 416) 43264 ['block4b_add[0][0]']
block4c_expand_bn (BatchNormal (None, 3, 3, 416) 1664 ['block4c_expand_conv[0][0]']
ization)
block4c_expand_activation (Act (None, 3, 3, 416) 0 ['block4c_expand_bn[0][0]']
ivation)
block4c_dwconv2 (DepthwiseConv (None, 3, 3, 416) 3744 ['block4c_expand_activation[0][0]
2D) ']
block4c_bn (BatchNormalization (None, 3, 3, 416) 1664 ['block4c_dwconv2[0][0]']
)
block4c_activation (Activation (None, 3, 3, 416) 0 ['block4c_bn[0][0]']
)
block4c_se_squeeze (GlobalAver (None, 416) 0 ['block4c_activation[0][0]']
agePooling2D)
block4c_se_reshape (Reshape) (None, 1, 1, 416) 0 ['block4c_se_squeeze[0][0]']
block4c_se_reduce (Conv2D) (None, 1, 1, 26) 10842 ['block4c_se_reshape[0][0]']
block4c_se_expand (Conv2D) (None, 1, 1, 416) 11232 ['block4c_se_reduce[0][0]']
block4c_se_excite (Multiply) (None, 3, 3, 416) 0 ['block4c_activation[0][0]',
'block4c_se_expand[0][0]']
block4c_project_conv (Conv2D) (None, 3, 3, 104) 43264 ['block4c_se_excite[0][0]']
block4c_project_bn (BatchNorma (None, 3, 3, 104) 416 ['block4c_project_conv[0][0]']
lization)
block4c_drop (Dropout) (None, 3, 3, 104) 0 ['block4c_project_bn[0][0]']
block4c_add (Add) (None, 3, 3, 104) 0 ['block4c_drop[0][0]',
'block4b_add[0][0]']
block4d_expand_conv (Conv2D) (None, 3, 3, 416) 43264 ['block4c_add[0][0]']
block4d_expand_bn (BatchNormal (None, 3, 3, 416) 1664 ['block4d_expand_conv[0][0]']
ization)
block4d_expand_activation (Act (None, 3, 3, 416) 0 ['block4d_expand_bn[0][0]']
ivation)
block4d_dwconv2 (DepthwiseConv (None, 3, 3, 416) 3744 ['block4d_expand_activation[0][0]
2D) ']
block4d_bn (BatchNormalization (None, 3, 3, 416) 1664 ['block4d_dwconv2[0][0]']
)
block4d_activation (Activation (None, 3, 3, 416) 0 ['block4d_bn[0][0]']
)
block4d_se_squeeze (GlobalAver (None, 416) 0 ['block4d_activation[0][0]']
agePooling2D)
block4d_se_reshape (Reshape) (None, 1, 1, 416) 0 ['block4d_se_squeeze[0][0]']
block4d_se_reduce (Conv2D) (None, 1, 1, 26) 10842 ['block4d_se_reshape[0][0]']
block4d_se_expand (Conv2D) (None, 1, 1, 416) 11232 ['block4d_se_reduce[0][0]']
block4d_se_excite (Multiply) (None, 3, 3, 416) 0 ['block4d_activation[0][0]',
'block4d_se_expand[0][0]']
block4d_project_conv (Conv2D) (None, 3, 3, 104) 43264 ['block4d_se_excite[0][0]']
block4d_project_bn (BatchNorma (None, 3, 3, 104) 416 ['block4d_project_conv[0][0]']
lization)
block4d_drop (Dropout) (None, 3, 3, 104) 0 ['block4d_project_bn[0][0]']
block4d_add (Add) (None, 3, 3, 104) 0 ['block4d_drop[0][0]',
'block4c_add[0][0]']
block5a_expand_conv (Conv2D) (None, 3, 3, 624) 64896 ['block4d_add[0][0]']
block5a_expand_bn (BatchNormal (None, 3, 3, 624) 2496 ['block5a_expand_conv[0][0]']
ization)
block5a_expand_activation (Act (None, 3, 3, 624) 0 ['block5a_expand_bn[0][0]']
ivation)
block5a_dwconv2 (DepthwiseConv (None, 3, 3, 624) 5616 ['block5a_expand_activation[0][0]
2D) ']
block5a_bn (BatchNormalization (None, 3, 3, 624) 2496 ['block5a_dwconv2[0][0]']
)
block5a_activation (Activation (None, 3, 3, 624) 0 ['block5a_bn[0][0]']
)
block5a_se_squeeze (GlobalAver (None, 624) 0 ['block5a_activation[0][0]']
agePooling2D)
block5a_se_reshape (Reshape) (None, 1, 1, 624) 0 ['block5a_se_squeeze[0][0]']
block5a_se_reduce (Conv2D) (None, 1, 1, 26) 16250 ['block5a_se_reshape[0][0]']
block5a_se_expand (Conv2D) (None, 1, 1, 624) 16848 ['block5a_se_reduce[0][0]']
block5a_se_excite (Multiply) (None, 3, 3, 624) 0 ['block5a_activation[0][0]',
'block5a_se_expand[0][0]']
block5a_project_conv (Conv2D) (None, 3, 3, 120) 74880 ['block5a_se_excite[0][0]']
block5a_project_bn (BatchNorma (None, 3, 3, 120) 480 ['block5a_project_conv[0][0]']
lization)
block5b_expand_conv (Conv2D) (None, 3, 3, 720) 86400 ['block5a_project_bn[0][0]']
block5b_expand_bn (BatchNormal (None, 3, 3, 720) 2880 ['block5b_expand_conv[0][0]']
ization)
block5b_expand_activation (Act (None, 3, 3, 720) 0 ['block5b_expand_bn[0][0]']
ivation)
block5b_dwconv2 (DepthwiseConv (None, 3, 3, 720) 6480 ['block5b_expand_activation[0][0]
2D) ']
block5b_bn (BatchNormalization (None, 3, 3, 720) 2880 ['block5b_dwconv2[0][0]']
)
block5b_activation (Activation (None, 3, 3, 720) 0 ['block5b_bn[0][0]']
)
block5b_se_squeeze (GlobalAver (None, 720) 0 ['block5b_activation[0][0]']
agePooling2D)
block5b_se_reshape (Reshape) (None, 1, 1, 720) 0 ['block5b_se_squeeze[0][0]']
block5b_se_reduce (Conv2D) (None, 1, 1, 30) 21630 ['block5b_se_reshape[0][0]']
block5b_se_expand (Conv2D) (None, 1, 1, 720) 22320 ['block5b_se_reduce[0][0]']
block5b_se_excite (Multiply) (None, 3, 3, 720) 0 ['block5b_activation[0][0]',
'block5b_se_expand[0][0]']
block5b_project_conv (Conv2D) (None, 3, 3, 120) 86400 ['block5b_se_excite[0][0]']
block5b_project_bn (BatchNorma (None, 3, 3, 120) 480 ['block5b_project_conv[0][0]']
lization)
block5b_drop (Dropout) (None, 3, 3, 120) 0 ['block5b_project_bn[0][0]']
block5b_add (Add) (None, 3, 3, 120) 0 ['block5b_drop[0][0]',
'block5a_project_bn[0][0]']
block5c_expand_conv (Conv2D) (None, 3, 3, 720) 86400 ['block5b_add[0][0]']
block5c_expand_bn (BatchNormal (None, 3, 3, 720) 2880 ['block5c_expand_conv[0][0]']
ization)
block5c_expand_activation (Act (None, 3, 3, 720) 0 ['block5c_expand_bn[0][0]']
ivation)
block5c_dwconv2 (DepthwiseConv (None, 3, 3, 720) 6480 ['block5c_expand_activation[0][0]
2D) ']
block5c_bn (BatchNormalization (None, 3, 3, 720) 2880 ['block5c_dwconv2[0][0]']
)
block5c_activation (Activation (None, 3, 3, 720) 0 ['block5c_bn[0][0]']
)
block5c_se_squeeze (GlobalAver (None, 720) 0 ['block5c_activation[0][0]']
agePooling2D)
block5c_se_reshape (Reshape) (None, 1, 1, 720) 0 ['block5c_se_squeeze[0][0]']
block5c_se_reduce (Conv2D) (None, 1, 1, 30) 21630 ['block5c_se_reshape[0][0]']
block5c_se_expand (Conv2D) (None, 1, 1, 720) 22320 ['block5c_se_reduce[0][0]']
block5c_se_excite (Multiply) (None, 3, 3, 720) 0 ['block5c_activation[0][0]',
'block5c_se_expand[0][0]']
block5c_project_conv (Conv2D) (None, 3, 3, 120) 86400 ['block5c_se_excite[0][0]']
block5c_project_bn (BatchNorma (None, 3, 3, 120) 480 ['block5c_project_conv[0][0]']
lization)
block5c_drop (Dropout) (None, 3, 3, 120) 0 ['block5c_project_bn[0][0]']
block5c_add (Add) (None, 3, 3, 120) 0 ['block5c_drop[0][0]',
'block5b_add[0][0]']
block5d_expand_conv (Conv2D) (None, 3, 3, 720) 86400 ['block5c_add[0][0]']
block5d_expand_bn (BatchNormal (None, 3, 3, 720) 2880 ['block5d_expand_conv[0][0]']
ization)
block5d_expand_activation (Act (None, 3, 3, 720) 0 ['block5d_expand_bn[0][0]']
ivation)
block5d_dwconv2 (DepthwiseConv (None, 3, 3, 720) 6480 ['block5d_expand_activation[0][0]
2D) ']
block5d_bn (BatchNormalization (None, 3, 3, 720) 2880 ['block5d_dwconv2[0][0]']
)
block5d_activation (Activation (None, 3, 3, 720) 0 ['block5d_bn[0][0]']
)
block5d_se_squeeze (GlobalAver (None, 720) 0 ['block5d_activation[0][0]']
agePooling2D)
block5d_se_reshape (Reshape) (None, 1, 1, 720) 0 ['block5d_se_squeeze[0][0]']
block5d_se_reduce (Conv2D) (None, 1, 1, 30) 21630 ['block5d_se_reshape[0][0]']
block5d_se_expand (Conv2D) (None, 1, 1, 720) 22320 ['block5d_se_reduce[0][0]']
block5d_se_excite (Multiply) (None, 3, 3, 720) 0 ['block5d_activation[0][0]',
'block5d_se_expand[0][0]']
block5d_project_conv (Conv2D) (None, 3, 3, 120) 86400 ['block5d_se_excite[0][0]']
block5d_project_bn (BatchNorma (None, 3, 3, 120) 480 ['block5d_project_conv[0][0]']
lization)
block5d_drop (Dropout) (None, 3, 3, 120) 0 ['block5d_project_bn[0][0]']
block5d_add (Add) (None, 3, 3, 120) 0 ['block5d_drop[0][0]',
'block5c_add[0][0]']
block5e_expand_conv (Conv2D) (None, 3, 3, 720) 86400 ['block5d_add[0][0]']
block5e_expand_bn (BatchNormal (None, 3, 3, 720) 2880 ['block5e_expand_conv[0][0]']
ization)
block5e_expand_activation (Act (None, 3, 3, 720) 0 ['block5e_expand_bn[0][0]']
ivation)
block5e_dwconv2 (DepthwiseConv (None, 3, 3, 720) 6480 ['block5e_expand_activation[0][0]
2D) ']
block5e_bn (BatchNormalization (None, 3, 3, 720) 2880 ['block5e_dwconv2[0][0]']
)
block5e_activation (Activation (None, 3, 3, 720) 0 ['block5e_bn[0][0]']
)
block5e_se_squeeze (GlobalAver (None, 720) 0 ['block5e_activation[0][0]']
agePooling2D)
block5e_se_reshape (Reshape) (None, 1, 1, 720) 0 ['block5e_se_squeeze[0][0]']
block5e_se_reduce (Conv2D) (None, 1, 1, 30) 21630 ['block5e_se_reshape[0][0]']
block5e_se_expand (Conv2D) (None, 1, 1, 720) 22320 ['block5e_se_reduce[0][0]']
block5e_se_excite (Multiply) (None, 3, 3, 720) 0 ['block5e_activation[0][0]',
'block5e_se_expand[0][0]']
block5e_project_conv (Conv2D) (None, 3, 3, 120) 86400 ['block5e_se_excite[0][0]']
block5e_project_bn (BatchNorma (None, 3, 3, 120) 480 ['block5e_project_conv[0][0]']
lization)
block5e_drop (Dropout) (None, 3, 3, 120) 0 ['block5e_project_bn[0][0]']
block5e_add (Add) (None, 3, 3, 120) 0 ['block5e_drop[0][0]',
'block5d_add[0][0]']
block5f_expand_conv (Conv2D) (None, 3, 3, 720) 86400 ['block5e_add[0][0]']
block5f_expand_bn (BatchNormal (None, 3, 3, 720) 2880 ['block5f_expand_conv[0][0]']
ization)
block5f_expand_activation (Act (None, 3, 3, 720) 0 ['block5f_expand_bn[0][0]']
ivation)
block5f_dwconv2 (DepthwiseConv (None, 3, 3, 720) 6480 ['block5f_expand_activation[0][0]
2D) ']
block5f_bn (BatchNormalization (None, 3, 3, 720) 2880 ['block5f_dwconv2[0][0]']
)
block5f_activation (Activation (None, 3, 3, 720) 0 ['block5f_bn[0][0]']
)
block5f_se_squeeze (GlobalAver (None, 720) 0 ['block5f_activation[0][0]']
agePooling2D)
block5f_se_reshape (Reshape) (None, 1, 1, 720) 0 ['block5f_se_squeeze[0][0]']
block5f_se_reduce (Conv2D) (None, 1, 1, 30) 21630 ['block5f_se_reshape[0][0]']
block5f_se_expand (Conv2D) (None, 1, 1, 720) 22320 ['block5f_se_reduce[0][0]']
block5f_se_excite (Multiply) (None, 3, 3, 720) 0 ['block5f_activation[0][0]',
'block5f_se_expand[0][0]']
block5f_project_conv (Conv2D) (None, 3, 3, 120) 86400 ['block5f_se_excite[0][0]']
block5f_project_bn (BatchNorma (None, 3, 3, 120) 480 ['block5f_project_conv[0][0]']
lization)
block5f_drop (Dropout) (None, 3, 3, 120) 0 ['block5f_project_bn[0][0]']
block5f_add (Add) (None, 3, 3, 120) 0 ['block5f_drop[0][0]',
'block5e_add[0][0]']
block6a_expand_conv (Conv2D) (None, 3, 3, 720) 86400 ['block5f_add[0][0]']
block6a_expand_bn (BatchNormal (None, 3, 3, 720) 2880 ['block6a_expand_conv[0][0]']
ization)
block6a_expand_activation (Act (None, 3, 3, 720) 0 ['block6a_expand_bn[0][0]']
ivation)
block6a_dwconv2 (DepthwiseConv (None, 2, 2, 720) 6480 ['block6a_expand_activation[0][0]
2D) ']
block6a_bn (BatchNormalization (None, 2, 2, 720) 2880 ['block6a_dwconv2[0][0]']
)
block6a_activation (Activation (None, 2, 2, 720) 0 ['block6a_bn[0][0]']
)
block6a_se_squeeze (GlobalAver (None, 720) 0 ['block6a_activation[0][0]']
agePooling2D)
block6a_se_reshape (Reshape) (None, 1, 1, 720) 0 ['block6a_se_squeeze[0][0]']
block6a_se_reduce (Conv2D) (None, 1, 1, 30) 21630 ['block6a_se_reshape[0][0]']
block6a_se_expand (Conv2D) (None, 1, 1, 720) 22320 ['block6a_se_reduce[0][0]']
block6a_se_excite (Multiply) (None, 2, 2, 720) 0 ['block6a_activation[0][0]',
'block6a_se_expand[0][0]']
block6a_project_conv (Conv2D) (None, 2, 2, 208) 149760 ['block6a_se_excite[0][0]']
block6a_project_bn (BatchNorma (None, 2, 2, 208) 832 ['block6a_project_conv[0][0]']
lization)
block6b_expand_conv (Conv2D) (None, 2, 2, 1248) 259584 ['block6a_project_bn[0][0]']
block6b_expand_bn (BatchNormal (None, 2, 2, 1248) 4992 ['block6b_expand_conv[0][0]']
ization)
block6b_expand_activation (Act (None, 2, 2, 1248) 0 ['block6b_expand_bn[0][0]']
ivation)
block6b_dwconv2 (DepthwiseConv (None, 2, 2, 1248) 11232 ['block6b_expand_activation[0][0]
2D) ']
block6b_bn (BatchNormalization (None, 2, 2, 1248) 4992 ['block6b_dwconv2[0][0]']
)
block6b_activation (Activation (None, 2, 2, 1248) 0 ['block6b_bn[0][0]']
)
block6b_se_squeeze (GlobalAver (None, 1248) 0 ['block6b_activation[0][0]']
agePooling2D)
block6b_se_reshape (Reshape) (None, 1, 1, 1248) 0 ['block6b_se_squeeze[0][0]']
block6b_se_reduce (Conv2D) (None, 1, 1, 52) 64948 ['block6b_se_reshape[0][0]']
block6b_se_expand (Conv2D) (None, 1, 1, 1248) 66144 ['block6b_se_reduce[0][0]']
block6b_se_excite (Multiply) (None, 2, 2, 1248) 0 ['block6b_activation[0][0]',
'block6b_se_expand[0][0]']
block6b_project_conv (Conv2D) (None, 2, 2, 208) 259584 ['block6b_se_excite[0][0]']
block6b_project_bn (BatchNorma (None, 2, 2, 208) 832 ['block6b_project_conv[0][0]']
lization)
block6b_drop (Dropout) (None, 2, 2, 208) 0 ['block6b_project_bn[0][0]']
block6b_add (Add) (None, 2, 2, 208) 0 ['block6b_drop[0][0]',
'block6a_project_bn[0][0]']
block6c_expand_conv (Conv2D) (None, 2, 2, 1248) 259584 ['block6b_add[0][0]']
block6c_expand_bn (BatchNormal (None, 2, 2, 1248) 4992 ['block6c_expand_conv[0][0]']
ization)
block6c_expand_activation (Act (None, 2, 2, 1248) 0 ['block6c_expand_bn[0][0]']
ivation)
block6c_dwconv2 (DepthwiseConv (None, 2, 2, 1248) 11232 ['block6c_expand_activation[0][0]
2D) ']
block6c_bn (BatchNormalization (None, 2, 2, 1248) 4992 ['block6c_dwconv2[0][0]']
)
block6c_activation (Activation (None, 2, 2, 1248) 0 ['block6c_bn[0][0]']
)
block6c_se_squeeze (GlobalAver (None, 1248) 0 ['block6c_activation[0][0]']
agePooling2D)
block6c_se_reshape (Reshape) (None, 1, 1, 1248) 0 ['block6c_se_squeeze[0][0]']
block6c_se_reduce (Conv2D) (None, 1, 1, 52) 64948 ['block6c_se_reshape[0][0]']
block6c_se_expand (Conv2D) (None, 1, 1, 1248) 66144 ['block6c_se_reduce[0][0]']
block6c_se_excite (Multiply) (None, 2, 2, 1248) 0 ['block6c_activation[0][0]',
'block6c_se_expand[0][0]']
block6c_project_conv (Conv2D) (None, 2, 2, 208) 259584 ['block6c_se_excite[0][0]']
block6c_project_bn (BatchNorma (None, 2, 2, 208) 832 ['block6c_project_conv[0][0]']
lization)
block6c_drop (Dropout) (None, 2, 2, 208) 0 ['block6c_project_bn[0][0]']
block6c_add (Add) (None, 2, 2, 208) 0 ['block6c_drop[0][0]',
'block6b_add[0][0]']
block6d_expand_conv (Conv2D) (None, 2, 2, 1248) 259584 ['block6c_add[0][0]']
block6d_expand_bn (BatchNormal (None, 2, 2, 1248) 4992 ['block6d_expand_conv[0][0]']
ization)
block6d_expand_activation (Act (None, 2, 2, 1248) 0 ['block6d_expand_bn[0][0]']
ivation)
block6d_dwconv2 (DepthwiseConv (None, 2, 2, 1248) 11232 ['block6d_expand_activation[0][0]
2D) ']
block6d_bn (BatchNormalization (None, 2, 2, 1248) 4992 ['block6d_dwconv2[0][0]']
)
block6d_activation (Activation (None, 2, 2, 1248) 0 ['block6d_bn[0][0]']
)
block6d_se_squeeze (GlobalAver (None, 1248) 0 ['block6d_activation[0][0]']
agePooling2D)
block6d_se_reshape (Reshape) (None, 1, 1, 1248) 0 ['block6d_se_squeeze[0][0]']
block6d_se_reduce (Conv2D) (None, 1, 1, 52) 64948 ['block6d_se_reshape[0][0]']
block6d_se_expand (Conv2D) (None, 1, 1, 1248) 66144 ['block6d_se_reduce[0][0]']
block6d_se_excite (Multiply) (None, 2, 2, 1248) 0 ['block6d_activation[0][0]',
'block6d_se_expand[0][0]']
block6d_project_conv (Conv2D) (None, 2, 2, 208) 259584 ['block6d_se_excite[0][0]']
block6d_project_bn (BatchNorma (None, 2, 2, 208) 832 ['block6d_project_conv[0][0]']
lization)
block6d_drop (Dropout) (None, 2, 2, 208) 0 ['block6d_project_bn[0][0]']
block6d_add (Add) (None, 2, 2, 208) 0 ['block6d_drop[0][0]',
'block6c_add[0][0]']
block6e_expand_conv (Conv2D) (None, 2, 2, 1248) 259584 ['block6d_add[0][0]']
block6e_expand_bn (BatchNormal (None, 2, 2, 1248) 4992 ['block6e_expand_conv[0][0]']
ization)
block6e_expand_activation (Act (None, 2, 2, 1248) 0 ['block6e_expand_bn[0][0]']
ivation)
block6e_dwconv2 (DepthwiseConv (None, 2, 2, 1248) 11232 ['block6e_expand_activation[0][0]
2D) ']
block6e_bn (BatchNormalization (None, 2, 2, 1248) 4992 ['block6e_dwconv2[0][0]']
)
block6e_activation (Activation (None, 2, 2, 1248) 0 ['block6e_bn[0][0]']
)
block6e_se_squeeze (GlobalAver (None, 1248) 0 ['block6e_activation[0][0]']
agePooling2D)
block6e_se_reshape (Reshape) (None, 1, 1, 1248) 0 ['block6e_se_squeeze[0][0]']
block6e_se_reduce (Conv2D) (None, 1, 1, 52) 64948 ['block6e_se_reshape[0][0]']
block6e_se_expand (Conv2D) (None, 1, 1, 1248) 66144 ['block6e_se_reduce[0][0]']
block6e_se_excite (Multiply) (None, 2, 2, 1248) 0 ['block6e_activation[0][0]',
'block6e_se_expand[0][0]']
block6e_project_conv (Conv2D) (None, 2, 2, 208) 259584 ['block6e_se_excite[0][0]']
block6e_project_bn (BatchNorma (None, 2, 2, 208) 832 ['block6e_project_conv[0][0]']
lization)
block6e_drop (Dropout) (None, 2, 2, 208) 0 ['block6e_project_bn[0][0]']
block6e_add (Add) (None, 2, 2, 208) 0 ['block6e_drop[0][0]',
'block6d_add[0][0]']
block6f_expand_conv (Conv2D) (None, 2, 2, 1248) 259584 ['block6e_add[0][0]']
block6f_expand_bn (BatchNormal (None, 2, 2, 1248) 4992 ['block6f_expand_conv[0][0]']
ization)
block6f_expand_activation (Act (None, 2, 2, 1248) 0 ['block6f_expand_bn[0][0]']
ivation)
block6f_dwconv2 (DepthwiseConv (None, 2, 2, 1248) 11232 ['block6f_expand_activation[0][0]
2D) ']
block6f_bn (BatchNormalization (None, 2, 2, 1248) 4992 ['block6f_dwconv2[0][0]']
)
block6f_activation (Activation (None, 2, 2, 1248) 0 ['block6f_bn[0][0]']
)
block6f_se_squeeze (GlobalAver (None, 1248) 0 ['block6f_activation[0][0]']
agePooling2D)
block6f_se_reshape (Reshape) (None, 1, 1, 1248) 0 ['block6f_se_squeeze[0][0]']
block6f_se_reduce (Conv2D) (None, 1, 1, 52) 64948 ['block6f_se_reshape[0][0]']
block6f_se_expand (Conv2D) (None, 1, 1, 1248) 66144 ['block6f_se_reduce[0][0]']
block6f_se_excite (Multiply) (None, 2, 2, 1248) 0 ['block6f_activation[0][0]',
'block6f_se_expand[0][0]']
block6f_project_conv (Conv2D) (None, 2, 2, 208) 259584 ['block6f_se_excite[0][0]']
block6f_project_bn (BatchNorma (None, 2, 2, 208) 832 ['block6f_project_conv[0][0]']
lization)
block6f_drop (Dropout) (None, 2, 2, 208) 0 ['block6f_project_bn[0][0]']
block6f_add (Add) (None, 2, 2, 208) 0 ['block6f_drop[0][0]',
'block6e_add[0][0]']
block6g_expand_conv (Conv2D) (None, 2, 2, 1248) 259584 ['block6f_add[0][0]']
block6g_expand_bn (BatchNormal (None, 2, 2, 1248) 4992 ['block6g_expand_conv[0][0]']
ization)
block6g_expand_activation (Act (None, 2, 2, 1248) 0 ['block6g_expand_bn[0][0]']
ivation)
block6g_dwconv2 (DepthwiseConv (None, 2, 2, 1248) 11232 ['block6g_expand_activation[0][0]
2D) ']
block6g_bn (BatchNormalization (None, 2, 2, 1248) 4992 ['block6g_dwconv2[0][0]']
)
block6g_activation (Activation (None, 2, 2, 1248) 0 ['block6g_bn[0][0]']
)
block6g_se_squeeze (GlobalAver (None, 1248) 0 ['block6g_activation[0][0]']
agePooling2D)
block6g_se_reshape (Reshape) (None, 1, 1, 1248) 0 ['block6g_se_squeeze[0][0]']
block6g_se_reduce (Conv2D) (None, 1, 1, 52) 64948 ['block6g_se_reshape[0][0]']
block6g_se_expand (Conv2D) (None, 1, 1, 1248) 66144 ['block6g_se_reduce[0][0]']
block6g_se_excite (Multiply) (None, 2, 2, 1248) 0 ['block6g_activation[0][0]',
'block6g_se_expand[0][0]']
block6g_project_conv (Conv2D) (None, 2, 2, 208) 259584 ['block6g_se_excite[0][0]']
block6g_project_bn (BatchNorma (None, 2, 2, 208) 832 ['block6g_project_conv[0][0]']
lization)
block6g_drop (Dropout) (None, 2, 2, 208) 0 ['block6g_project_bn[0][0]']
block6g_add (Add) (None, 2, 2, 208) 0 ['block6g_drop[0][0]',
'block6f_add[0][0]']
block6h_expand_conv (Conv2D) (None, 2, 2, 1248) 259584 ['block6g_add[0][0]']
block6h_expand_bn (BatchNormal (None, 2, 2, 1248) 4992 ['block6h_expand_conv[0][0]']
ization)
block6h_expand_activation (Act (None, 2, 2, 1248) 0 ['block6h_expand_bn[0][0]']
ivation)
block6h_dwconv2 (DepthwiseConv (None, 2, 2, 1248) 11232 ['block6h_expand_activation[0][0]
2D) ']
block6h_bn (BatchNormalization (None, 2, 2, 1248) 4992 ['block6h_dwconv2[0][0]']
)
block6h_activation (Activation (None, 2, 2, 1248) 0 ['block6h_bn[0][0]']
)
block6h_se_squeeze (GlobalAver (None, 1248) 0 ['block6h_activation[0][0]']
agePooling2D)
block6h_se_reshape (Reshape) (None, 1, 1, 1248) 0 ['block6h_se_squeeze[0][0]']
block6h_se_reduce (Conv2D) (None, 1, 1, 52) 64948 ['block6h_se_reshape[0][0]']
block6h_se_expand (Conv2D) (None, 1, 1, 1248) 66144 ['block6h_se_reduce[0][0]']
block6h_se_excite (Multiply) (None, 2, 2, 1248) 0 ['block6h_activation[0][0]',
'block6h_se_expand[0][0]']
block6h_project_conv (Conv2D) (None, 2, 2, 208) 259584 ['block6h_se_excite[0][0]']
block6h_project_bn (BatchNorma (None, 2, 2, 208) 832 ['block6h_project_conv[0][0]']
lization)
block6h_drop (Dropout) (None, 2, 2, 208) 0 ['block6h_project_bn[0][0]']
block6h_add (Add) (None, 2, 2, 208) 0 ['block6h_drop[0][0]',
'block6g_add[0][0]']
block6i_expand_conv (Conv2D) (None, 2, 2, 1248) 259584 ['block6h_add[0][0]']
block6i_expand_bn (BatchNormal (None, 2, 2, 1248) 4992 ['block6i_expand_conv[0][0]']
ization)
block6i_expand_activation (Act (None, 2, 2, 1248) 0 ['block6i_expand_bn[0][0]']
ivation)
block6i_dwconv2 (DepthwiseConv (None, 2, 2, 1248) 11232 ['block6i_expand_activation[0][0]
2D) ']
block6i_bn (BatchNormalization (None, 2, 2, 1248) 4992 ['block6i_dwconv2[0][0]']
)
block6i_activation (Activation (None, 2, 2, 1248) 0 ['block6i_bn[0][0]']
)
block6i_se_squeeze (GlobalAver (None, 1248) 0 ['block6i_activation[0][0]']
agePooling2D)
block6i_se_reshape (Reshape) (None, 1, 1, 1248) 0 ['block6i_se_squeeze[0][0]']
block6i_se_reduce (Conv2D) (None, 1, 1, 52) 64948 ['block6i_se_reshape[0][0]']
block6i_se_expand (Conv2D) (None, 1, 1, 1248) 66144 ['block6i_se_reduce[0][0]']
block6i_se_excite (Multiply) (None, 2, 2, 1248) 0 ['block6i_activation[0][0]',
'block6i_se_expand[0][0]']
block6i_project_conv (Conv2D) (None, 2, 2, 208) 259584 ['block6i_se_excite[0][0]']
block6i_project_bn (BatchNorma (None, 2, 2, 208) 832 ['block6i_project_conv[0][0]']
lization)
block6i_drop (Dropout) (None, 2, 2, 208) 0 ['block6i_project_bn[0][0]']
block6i_add (Add) (None, 2, 2, 208) 0 ['block6i_drop[0][0]',
'block6h_add[0][0]']
block6j_expand_conv (Conv2D) (None, 2, 2, 1248) 259584 ['block6i_add[0][0]']
block6j_expand_bn (BatchNormal (None, 2, 2, 1248) 4992 ['block6j_expand_conv[0][0]']
ization)
block6j_expand_activation (Act (None, 2, 2, 1248) 0 ['block6j_expand_bn[0][0]']
ivation)
block6j_dwconv2 (DepthwiseConv (None, 2, 2, 1248) 11232 ['block6j_expand_activation[0][0]
2D) ']
block6j_bn (BatchNormalization (None, 2, 2, 1248) 4992 ['block6j_dwconv2[0][0]']
)
block6j_activation (Activation (None, 2, 2, 1248) 0 ['block6j_bn[0][0]']
)
block6j_se_squeeze (GlobalAver (None, 1248) 0 ['block6j_activation[0][0]']
agePooling2D)
block6j_se_reshape (Reshape) (None, 1, 1, 1248) 0 ['block6j_se_squeeze[0][0]']
block6j_se_reduce (Conv2D) (None, 1, 1, 52) 64948 ['block6j_se_reshape[0][0]']
block6j_se_expand (Conv2D) (None, 1, 1, 1248) 66144 ['block6j_se_reduce[0][0]']
block6j_se_excite (Multiply) (None, 2, 2, 1248) 0 ['block6j_activation[0][0]',
'block6j_se_expand[0][0]']
block6j_project_conv (Conv2D) (None, 2, 2, 208) 259584 ['block6j_se_excite[0][0]']
block6j_project_bn (BatchNorma (None, 2, 2, 208) 832 ['block6j_project_conv[0][0]']
lization)
block6j_drop (Dropout) (None, 2, 2, 208) 0 ['block6j_project_bn[0][0]']
block6j_add (Add) (None, 2, 2, 208) 0 ['block6j_drop[0][0]',
'block6i_add[0][0]']
top_conv (Conv2D) (None, 2, 2, 1408) 292864 ['block6j_add[0][0]']
top_bn (BatchNormalization) (None, 2, 2, 1408) 5632 ['top_conv[0][0]']
top_activation (Activation) (None, 2, 2, 1408) 0 ['top_bn[0][0]']
==================================================================================================
Total params: 8,769,374
Trainable params: 8,687,086
Non-trainable params: 82,288
__________________________________________________________________________________________________
We have imported the EfficientNet model up to layer 'block5f_expand_activation', as this layer showed the best performance compared to the alternatives (discussed below). The EfficientNet layers will be frozen, so the only trainable layers will be those we add ourselves. After flattening the output of 'block5f_expand_activation', we will add the same architecture we added earlier to the VGG16 and ResNet v2 models: two dense layers, followed by a Dropout layer, another dense layer, and BatchNormalization. We will end with a softmax classifier.
transfer_layer_EfficientNet = EfficientNet.get_layer('block5f_expand_activation')
EfficientNet.trainable = False
# Flatten the input
x = Flatten()(transfer_layer_EfficientNet.output)
# Dense layers
x = Dense(256, activation='relu')(x)
x = Dense(128, activation='relu')(x)
x = Dropout(0.2)(x)
x = Dense(64, activation='relu')(x)
x = BatchNormalization()(x)
# Classifier
pred = Dense(4, activation='softmax')(x)
# Initialize the model
model_5 = Model(EfficientNet.input, pred)
# Creating a checkpoint which saves model weights from the best epoch
checkpoint = ModelCheckpoint('./model_5.h5', monitor='val_accuracy', verbose=1, save_best_only=True, mode='auto')
# Initiates early stopping if validation loss does not continue to improve
early_stopping = EarlyStopping(monitor = 'val_loss',
min_delta = 0,
patience = 12,
verbose = 1,
restore_best_weights = True)
# Initiates reduced learning rate if validation loss does not continue to improve
reduce_learningrate = ReduceLROnPlateau(monitor = 'val_loss',
factor = 0.2,
patience = 3,
verbose = 1,
min_delta = 0.0001)
callbacks_list = [checkpoint, early_stopping, reduce_learningrate]
# Compiling model with optimizer set to Adam, loss set to categorical_crossentropy, and metrics set to accuracy
model_5.compile(optimizer = Adam(learning_rate = 0.001), loss = 'categorical_crossentropy', metrics = ['accuracy'])
# Fitting model with epochs set to 100
history_5 = model_5.fit(train_set_rgb, validation_data = val_set_rgb, epochs = 100, callbacks = callbacks_list)
Epoch  1/100: loss 1.4246, accuracy 0.2580, val_loss 1.3986, val_accuracy 0.2289, lr 1.0e-03 (val_accuracy improved to 0.22885; model saved)
Epoch  2/100: loss 1.3988, accuracy 0.2626, val_loss 1.4146, val_accuracy 0.2289, lr 1.0e-03
Epoch  3/100: loss 1.3956, accuracy 0.2587, val_loss 1.4337, val_accuracy 0.2443, lr 1.0e-03 (val_accuracy improved to 0.24432; model saved)
Epoch  4/100: loss 1.3906, accuracy 0.2629, val_loss 1.3707, val_accuracy 0.2443, lr 1.0e-03
Epoch  5/100: loss 1.3949, accuracy 0.2601, val_loss 1.3815, val_accuracy 0.2289, lr 1.0e-03
Epoch  6/100: loss 1.3943, accuracy 0.2531, val_loss 1.4059, val_accuracy 0.2289, lr 1.0e-03
Epoch  7/100: loss 1.3907, accuracy 0.2586, val_loss 1.3807, val_accuracy 0.2289, lr 1.0e-03 (ReduceLROnPlateau: lr reduced to 2.0e-04)
Epoch  8/100: loss 1.3847, accuracy 0.2577, val_loss 1.3774, val_accuracy 0.2289, lr 2.0e-04
Epoch  9/100: loss 1.3828, accuracy 0.2626, val_loss 1.3743, val_accuracy 0.2289, lr 2.0e-04
Epoch 10/100: loss 1.3832, accuracy 0.2664, val_loss 1.3724, val_accuracy 0.2289, lr 2.0e-04 (ReduceLROnPlateau: lr reduced to 4.0e-05)
Epoch 11/100: loss 1.3828, accuracy 0.2630, val_loss 1.3758, val_accuracy 0.2289, lr 4.0e-05
Epoch 12/100: loss 1.3824, accuracy 0.2657, val_loss 1.3776, val_accuracy 0.2289, lr 4.0e-05
Epoch 13/100: loss 1.3820, accuracy 0.2657, val_loss 1.3756, val_accuracy 0.2289, lr 4.0e-05 (ReduceLROnPlateau: lr reduced to 8.0e-06)
Epoch 14/100: loss 1.3813, accuracy 0.2699, val_loss 1.3776, val_accuracy 0.2289, lr 8.0e-06
Epoch 15/100: loss 1.3815, accuracy 0.2655, val_loss 1.3778, val_accuracy 0.2289, lr 8.0e-06
Epoch 16/100: loss 1.3817, accuracy 0.2652, val_loss 1.3777, val_accuracy 0.2289, lr 8.0e-06 (early stopping; restoring model weights from the end of the best epoch: 4)
# Plotting the accuracies
plt.figure(figsize = (10, 5))
plt.plot(history_5.history['accuracy'])
plt.plot(history_5.history['val_accuracy'])
plt.title('Accuracy - EfficientNet Model')
plt.ylabel('Accuracy')
plt.xlabel('Epoch')
plt.legend(['Training', 'Validation'], loc='center right')
plt.show()
# Plotting the losses
plt.figure(figsize = (10, 5))
plt.plot(history_5.history['loss'])
plt.plot(history_5.history['val_loss'])
plt.title('Loss - EfficientNet Model')
plt.ylabel('Loss')
plt.xlabel('Epoch')
plt.legend(['Training', 'Validation'], loc='upper right')
plt.show()
# Evaluating the model's performance on the test set
accuracy = model_5.evaluate(test_set_rgb)
4/4 [==============================] - 0s 58ms/step - loss: 1.3913 - accuracy: 0.2500
Observations and Insights:
As imported and modified, this model performs poorly. After just 4 epochs (the 'best' epoch), training accuracy stands at 0.26 and validation accuracy at 0.24. Both accuracy curves are almost immediately horizontal, and loss declines only slightly before leveling off. With test accuracy coming in at 0.25, the model is no better than random guessing: a model that classified every single image as 'happy' would produce the same 0.25 accuracy on our evenly distributed test set.
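To make that baseline concrete, a constant classifier scores exactly 1/k on a balanced k-class set. A minimal sketch (the 32-images-per-class split is an assumption mirroring our 128-image test set):

```python
# Hypothetical balanced test labels: 32 images per class,
# mirroring our 128-image test set (0=happy, 1=sad, 2=neutral, 3=surprise)
y_true = [label for label in (0, 1, 2, 3) for _ in range(32)]

# A "model" that predicts 'happy' (class 0) for every image
y_pred = [0] * len(y_true)

# Accuracy of the constant classifier: correct on exactly 1 of 4 classes
baseline_accuracy = sum(p == t for p, t in zip(y_pred, y_true)) / len(y_true)
print(baseline_accuracy)  # 0.25
```

Any model worth keeping must clear this 0.25 floor by a wide margin.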
Again, it was difficult to select a 'best' layer from which to import the EfficientNet model. A history of alternative models is below.
| Model | Train Loss | Train Accuracy | Val Loss | Val Accuracy |
|---|---|---|---|---|
| EfficientNet block5f_expand_activation (selected) | 1.39 | 0.26 | 1.37 | 0.24 |
| EfficientNet block6e_expand_activation | 1.53 | 0.25 | 1.45 | 0.22 |
| EfficientNet block4a_expand_activation | 1.42 | 0.25 | 1.42 | 0.21 |
| EfficientNet block3c_expand_activation | 1.47 | 0.26 | 1.44 | 0.22 |
Overall Observations and Insights on Transfer Learning Models:
| Model | Parameters | Train Loss | Train Accuracy | Val Loss | Val Accuracy | Test Accuracy |
|---|---|---|---|---|---|---|
| Model 1.1: Baseline Grayscale | 605,060 | 0.68 | 0.72 | 0.78 | 0.68 | 0.65 |
| Model 1.2: Baseline RGB | 605,572 | 0.68 | 0.72 | 0.78 | 0.68 | 0.63 |
| Model 2.1: 2nd Gen Grayscale | 455,780 | 0.54 | 0.78 | 0.74 | 0.71 | 0.69 |
| Model 2.2: 2nd Gen RGB | 457,828 | 0.59 | 0.76 | 0.72 | 0.71 | 0.68 |
| Model 3: VGG16 | 14,714,688 | 0.71 | 0.72 | 0.80 | 0.67 | 0.66 |
| Model 4: ResNet V2 | 42,658,176 | 1.43 | 0.26 | 1.35 | 0.36 | 0.28 |
| Model 5: EfficientNet | 8,769,374 | 1.39 | 0.26 | 1.37 | 0.24 | 0.25 |
As previewed above, it is time to expand our 2nd generation grayscale model to see if we can improve performance. Grayscale slightly outperformed RGB in our first two models, so we will leave RGB behind and proceed with color_mode set to grayscale.
As we are proceeding with color_mode set to grayscale, we will create new data loaders for our more complex CNN, Model 6. Because data augmentation takes place when we instantiate an ImageDataGenerator object, it is convenient to create data loaders specific to our new model so we can easily fine-tune our hyperparameters as needed. The ImageDataGenerators below use the parameters of the final Milestone 1 model, the highest-performing CNN thus far; they were chosen after exhaustive fine-tuning of the model, as discussed later.
batch_size = 32
# Creating ImageDataGenerator objects for grayscale colormode
datagen_train_grayscale = ImageDataGenerator(horizontal_flip = True,
rescale = 1./255,
brightness_range = (0.7,1.3),
rotation_range=25)
datagen_validation_grayscale = ImageDataGenerator(horizontal_flip = True,
rescale = 1./255,
brightness_range = (0.7,1.3),
rotation_range=25)
datagen_test_grayscale = ImageDataGenerator(horizontal_flip = True,
rescale = 1./255,
brightness_range = (0.7,1.3),
rotation_range=25)
# Creating train, validation, and test sets for grayscale colormode
print("Grayscale Images")
train_set_grayscale = datagen_train_grayscale.flow_from_directory(dir_train,
target_size = (img_size, img_size),
color_mode = "grayscale",
batch_size = batch_size,
class_mode = 'categorical',
classes = ['happy', 'sad', 'neutral', 'surprise'],
seed = 42,
shuffle = True)
val_set_grayscale = datagen_validation_grayscale.flow_from_directory(dir_validation,
target_size = (img_size, img_size),
color_mode = "grayscale",
batch_size = batch_size,
class_mode = 'categorical',
classes = ['happy', 'sad', 'neutral', 'surprise'],
seed = 42,
shuffle = True)
test_set_grayscale = datagen_test_grayscale.flow_from_directory(dir_test,
target_size = (img_size, img_size),
color_mode = "grayscale",
batch_size = batch_size,
class_mode = 'categorical',
classes = ['happy', 'sad', 'neutral', 'surprise'],
seed = 42,
shuffle = True)
Grayscale Images Found 15109 images belonging to 4 classes. Found 4977 images belonging to 4 classes. Found 128 images belonging to 4 classes.
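As a quick sanity check, the image counts reported by flow_from_directory line up with the 75% / 24.5% / 0.5% split described at the outset (percentages below are computed from the reported counts, not taken from the source):

```python
# Image counts reported by flow_from_directory above
n_train, n_val, n_test = 15109, 4977, 128
n_total = n_train + n_val + n_test  # 20214 images in total

# Fractions of the full dataset held by each split
print(f"train: {n_train / n_total:.1%}")  # ~74.7%
print(f"val:   {n_val / n_total:.1%}")    # ~24.6%
print(f"test:  {n_test / n_total:.1%}")   # ~0.6%
```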
The structure of the Milestone 1 model (Model 6) is below. Many configurations were tested, and the following architecture led to the best performance.
# Creating a Sequential model
model_6 = Sequential()
# Convolutional Block #1
model_6.add(Conv2D(64, (3, 3), input_shape = (48, 48, 1), activation='relu', padding = 'same'))
model_6.add(BatchNormalization())
model_6.add(LeakyReLU(alpha = 0.1))
model_6.add(MaxPooling2D(2, 2))
model_6.add(GaussianNoise(0.1))
# Convolutional Block #2
model_6.add(Conv2D(128, (3, 3), activation='relu', padding = 'same'))
model_6.add(BatchNormalization())
model_6.add(LeakyReLU(alpha = 0.1))
model_6.add(MaxPooling2D(2, 2))
model_6.add(GaussianNoise(0.1))
# Convolutional Block #3
model_6.add(Conv2D(512, (2, 2), activation='relu', padding = 'same'))
model_6.add(BatchNormalization())
model_6.add(LeakyReLU(alpha = 0.1))
model_6.add(MaxPooling2D(2, 2))
model_6.add(Dropout(0.1))
# Convolutional Block #4
model_6.add(Conv2D(512, (2, 2), activation='relu', padding = 'same'))
model_6.add(BatchNormalization())
model_6.add(LeakyReLU(alpha = 0.1))
model_6.add(MaxPooling2D(2, 2))
model_6.add(GaussianNoise(0.1))
# Convolutional Block #5
model_6.add(Conv2D(256, (2, 2), activation='relu', padding = 'same'))
model_6.add(BatchNormalization())
model_6.add(LeakyReLU(alpha = 0.1))
model_6.add(MaxPooling2D(2, 2))
model_6.add(Dropout(0.1))
# Flatten layer
model_6.add(Flatten())
# Dense layers
model_6.add(Dense(256, activation = 'relu'))
model_6.add(BatchNormalization())
model_6.add(Dropout(0.1))
model_6.add(Dense(512, activation = 'relu'))
model_6.add(BatchNormalization())
model_6.add(Dropout(0.05))
# Classifier
model_6.add(Dense(4, activation = 'softmax'))
model_6.summary()
Metal device set to: Apple M1 Pro
Model: "sequential"
_________________________________________________________________________
 Layer (type)                                Output Shape         Param #
=========================================================================
 conv2d (Conv2D)                             (None, 48, 48, 64)   640
 batch_normalization (BatchNormalization)    (None, 48, 48, 64)   256
 leaky_re_lu (LeakyReLU)                     (None, 48, 48, 64)   0
 max_pooling2d (MaxPooling2D)                (None, 24, 24, 64)   0
 gaussian_noise (GaussianNoise)              (None, 24, 24, 64)   0
 conv2d_1 (Conv2D)                           (None, 24, 24, 128)  73856
 batch_normalization_1 (BatchNormalization)  (None, 24, 24, 128)  512
 leaky_re_lu_1 (LeakyReLU)                   (None, 24, 24, 128)  0
 max_pooling2d_1 (MaxPooling2D)              (None, 12, 12, 128)  0
 gaussian_noise_1 (GaussianNoise)            (None, 12, 12, 128)  0
 conv2d_2 (Conv2D)                           (None, 12, 12, 512)  262656
 batch_normalization_2 (BatchNormalization)  (None, 12, 12, 512)  2048
 leaky_re_lu_2 (LeakyReLU)                   (None, 12, 12, 512)  0
 max_pooling2d_2 (MaxPooling2D)              (None, 6, 6, 512)    0
 dropout (Dropout)                           (None, 6, 6, 512)    0
 conv2d_3 (Conv2D)                           (None, 6, 6, 512)    1049088
 batch_normalization_3 (BatchNormalization)  (None, 6, 6, 512)    2048
 leaky_re_lu_3 (LeakyReLU)                   (None, 6, 6, 512)    0
 max_pooling2d_3 (MaxPooling2D)              (None, 3, 3, 512)    0
 gaussian_noise_2 (GaussianNoise)            (None, 3, 3, 512)    0
 conv2d_4 (Conv2D)                           (None, 3, 3, 256)    524544
 batch_normalization_4 (BatchNormalization)  (None, 3, 3, 256)    1024
 leaky_re_lu_4 (LeakyReLU)                   (None, 3, 3, 256)    0
 max_pooling2d_4 (MaxPooling2D)              (None, 1, 1, 256)    0
 dropout_1 (Dropout)                         (None, 1, 1, 256)    0
 flatten (Flatten)                           (None, 256)          0
 dense (Dense)                               (None, 256)          65792
 batch_normalization_5 (BatchNormalization)  (None, 256)          1024
 dropout_2 (Dropout)                         (None, 256)          0
 dense_1 (Dense)                             (None, 512)          131584
 batch_normalization_6 (BatchNormalization)  (None, 512)          2048
 dropout_3 (Dropout)                         (None, 512)          0
 dense_2 (Dense)                             (None, 4)            2052
=========================================================================
Total params: 2,119,172
Trainable params: 2,114,692
Non-trainable params: 4,480
_________________________________________________________________________
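As a sanity check, the parameter counts in the summary can be reproduced by hand from the standard formulas (a small pure-Python check, independent of Keras):

```python
# Reproducing the parameter counts from model_6.summary() by hand.
# Conv2D: (kernel_h * kernel_w * in_channels + 1) * filters  (+1 for the bias)
# Dense:  (inputs + 1) * units
# BatchNormalization: 4 * channels (gamma, beta, moving mean, moving variance)
def conv_params(kh, kw, c_in, c_out):
    return (kh * kw * c_in + 1) * c_out

def dense_params(n_in, n_out):
    return (n_in + 1) * n_out

def bn_params(channels):
    return 4 * channels

total = (conv_params(3, 3, 1, 64)      + bn_params(64)    # block 1
         + conv_params(3, 3, 64, 128)  + bn_params(128)   # block 2
         + conv_params(2, 2, 128, 512) + bn_params(512)   # block 3
         + conv_params(2, 2, 512, 512) + bn_params(512)   # block 4
         + conv_params(2, 2, 512, 256) + bn_params(256)   # block 5
         + dense_params(256, 256) + bn_params(256)        # dense layer 1
         + dense_params(256, 512) + bn_params(512)        # dense layer 2
         + dense_params(512, 4))                          # classifier
print(f"{total:,}")  # 2,119,172 -- matches 'Total params' above
```

The non-trainable count also checks out: only the moving mean and variance of each BatchNormalization layer are non-trainable, i.e. half of the 8,960 batch-norm parameters, or 4,480.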
# Creating a checkpoint which saves model weights from the best epoch
checkpoint = ModelCheckpoint('./model_6.h5', monitor='val_accuracy', verbose=1, save_best_only=True, mode='auto')
# Initiates early stopping if validation loss does not continue to improve
early_stopping = EarlyStopping(
    monitor = 'val_loss',
    min_delta = 0,
    patience = 10,
    verbose = 1,
    restore_best_weights = True)
# Initiates reduced learning rate if validation loss does not continue to improve
reduce_learningrate = ReduceLROnPlateau(
    monitor = 'val_loss',
    factor = 0.2,
    patience = 3,
    verbose = 1,
    min_delta = 0.0001)
callbacks_list = [checkpoint, early_stopping, reduce_learningrate]
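The interaction of these callbacks shows up in the training log: each time ReduceLROnPlateau fires, the learning rate is multiplied by `factor = 0.2`. A minimal sketch of that schedule (not the Keras implementation), starting from Adam's default learning rate of 1e-3:

```python
def plateau_schedule(initial_lr, n_reductions, factor=0.2):
    """Learning rates produced by successive ReduceLROnPlateau triggers."""
    lrs = [initial_lr]
    for _ in range(n_reductions):
        lrs.append(lrs[-1] * factor)
    return lrs

lrs = plateau_schedule(1e-3, 3)
for lr in lrs:
    print(f"{lr:.1e}")  # 1.0e-03, 2.0e-04, 4.0e-05, 8.0e-06
```

These are exactly the `lr` values visible in the log that follows; with `patience = 3` on the scheduler and `patience = 10` on early stopping, the learning rate is reduced several times before training halts.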
# Compiling model with optimizer set to Adam, loss set to categorical_crossentropy, and metrics set to accuracy
model_6.compile(optimizer = 'Adam', loss = 'categorical_crossentropy', metrics = ['accuracy'])
# Fitting model with epochs set to 100
history_6 = model_6.fit(train_set_grayscale, validation_data = val_set_grayscale, epochs = 100, callbacks = callbacks_list)
Epoch 1/100:  loss: 1.4412 - accuracy: 0.3567 - val_loss: 1.4990 - val_accuracy: 0.3665 - lr: 1.0000e-03
Epoch 6/100:  loss: 0.8290 - accuracy: 0.6543 - val_loss: 0.8368 - val_accuracy: 0.6580 - lr: 1.0000e-03
Epoch 14/100: loss: 0.5842 - accuracy: 0.7673 - val_loss: 0.6154 - val_accuracy: 0.7509 - lr: 4.0000e-05
Epoch 22/100: loss: 0.5419 - accuracy: 0.7840 - val_loss: 0.6065 - val_accuracy: 0.7671 - lr: 8.0000e-06  (best val_accuracy: 0.76713)
Epoch 33/100: loss: 0.5360 - accuracy: 0.7874 - val_loss: 0.5998 - val_accuracy: 0.7639 - lr: 6.4000e-08
Epoch 43/100: loss: 0.5267 - accuracy: 0.7914 - val_loss: 0.6077 - val_accuracy: 0.7619 - lr: 5.1200e-10
Restoring model weights from the end of the best epoch: 33.
Epoch 43: early stopping
[Training log abridged; ReduceLROnPlateau lowered the learning rate at epochs 9, 13, 20, 23, 26, 32, 36, 39, and 42.]
# Plotting the accuracies
plt.figure(figsize = (10, 5))
plt.plot(history_6.history['accuracy'])
plt.plot(history_6.history['val_accuracy'])
plt.title('Accuracy - Complex Model')
plt.ylabel('Accuracy')
plt.xlabel('Epoch')
plt.legend(['Training', 'Validation'], loc='lower right')
plt.show()
# Plotting the losses
plt.figure(figsize = (10, 5))
plt.plot(history_6.history['loss'])
plt.plot(history_6.history['val_loss'])
plt.title('Loss - Complex Model')
plt.ylabel('Loss')
plt.xlabel('Epoch')
plt.legend(['Training', 'Validation'], loc='upper right')
plt.show()
# Evaluating the model's performance on the test set
accuracy = model_6.evaluate(test_set_grayscale)
4/4 [==============================] - 0s 36ms/step - loss: 0.5633 - accuracy: 0.7578
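For reference, the accuracy metric reported by `evaluate` is plain categorical accuracy: the fraction of images whose predicted class (the argmax of the softmax output) matches the one-hot label. A small NumPy sketch with a hypothetical toy batch, not Keras's implementation:

```python
import numpy as np

def categorical_accuracy(y_true, y_pred):
    """Fraction of samples where the predicted class (argmax of the softmax
    output) matches the one-hot encoded label."""
    return float(np.mean(np.argmax(y_true, axis=1) == np.argmax(y_pred, axis=1)))

# Toy batch: 4 classes (happy, sad, neutral, surprise), 4 samples, 3 correct.
y_true = np.eye(4)
y_pred = np.array([[0.7, 0.1, 0.1, 0.1],
                   [0.1, 0.6, 0.2, 0.1],
                   [0.5, 0.2, 0.2, 0.1],   # wrong: predicts class 0, label is 2
                   [0.1, 0.1, 0.1, 0.7]])
print(categorical_accuracy(y_true, y_pred))  # 0.75
```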
Observations and Insights:
Model 6, our Milestone 1 model, outperforms all previous models. After 33 epochs (the best epoch), training accuracy stands at 0.79 and validation accuracy at 0.76. Accuracy and loss for the training and validation data improve in tandem before leveling off. The model begins to overfit around epoch 15, but the overfitting is not as severe as in previous models. Test accuracy is 0.76. Overall, Model 6 generalizes better than its predecessors and is the top performer thus far. That said, it still overfits the training data, so it would not be advisable to deploy it as is.
This model underwent numerous transformations before arriving at its final state. Parameters were tuned, layers were added, layers were removed, and eventually the above model was determined to be the best iteration. An abridged history of model development can be found in the table below.
The starting point for our final model was as follows:
[Table not preserved in this export: layer-by-layer contents of Convolutional Blocks #1-#5, the final layers, and the parameter count of the starting model.]
Below is an abridged summary of the actions taken to improve the model. In many cases, parameters or layers were adjusted, added, or removed, only to be returned to their original state when the desired impact was not realized. The model went through dozens of iterations, with the following transformations being the most impactful.
| Action Taken | Train Loss | Train Accuracy | Val Loss | Val Accuracy |
|---|---|---|---|---|
| Starting model as outlined above | 0.77 | 0.70 | 0.89 | 0.58 |
| Dropout(0.1) layers added to conv blocks 1 and 5 to reduce overfitting | 0.75 | 0.74 | 0.66 | 0.61 |
| Shear_range removed entirely to determine effect | 0.76 | 0.74 | 0.68 | 0.60 |
| Rotation_range added and optimized | 0.74 | 0.74 | 0.62 | 0.61 |
| Additional dropout layers added to blocks 2 and 4 | 0.59 | 0.78 | 0.64 | 0.68 |
| Number of neurons in final dense layer set to 512 | 0.68 | 0.71 | 0.62 | 0.71 |
| Number of neurons in block 4 increased to 512 | 0.70 | 0.73 | 0.60 | 0.74 |
| Dropout layers swapped out for GaussianNoise in blocks 1 and 2 | 0.61 | 0.74 | 0.57 | 0.75 |
| Brightness_range narrowed to (0.5,1.5) then to (0.7,1.3) | 0.59 | 0.75 | 0.60 | 0.75 |
| Kernel size enlarged to 3x3 in first then also second block | 0.55 | 0.78 | 0.57 | 0.75 |
| Dropout in block 5 reduced to 0.1, resulting in final model | 0.54 | 0.79 | 0.60 | 0.76 |
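One of the more impactful changes above was swapping Dropout for GaussianNoise in the early blocks. Like Dropout, Keras's GaussianNoise layer is active only during training; at inference it is the identity. A minimal NumPy sketch of that behavior (the helper is illustrative, not the Keras layer):

```python
import numpy as np

def gaussian_noise_layer(x, stddev=0.1, training=False, rng=None):
    """Additive zero-mean Gaussian noise at training time; identity at
    inference. Illustrative stand-in for Keras's GaussianNoise layer."""
    if not training:
        return x
    rng = rng if rng is not None else np.random.default_rng()
    return x + rng.normal(0.0, stddev, size=x.shape)

x = np.ones((2, 3))
print(np.array_equal(gaussian_noise_layer(x, training=False), x))  # True: no-op at inference
noisy = gaussian_noise_layer(x, stddev=0.1, training=True, rng=np.random.default_rng(0))
print(noisy.shape)  # (2, 3): same shape, perturbed values during training
```

Unlike Dropout, which zeroes activations, GaussianNoise perturbs them continuously, which can act as a gentler regularizer on early feature maps.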
While Model 6 was an improvement on previous models, it was still overfitting the training data. In order to feel comfortable recommending a model for deployment in the context of this project, we need to improve on Model 6. Model 7 is an attempt to develop a deployable CNN. We want our model to have high accuracy, while also maintaining a good fit (no overfitting/underfitting) and generalizing well to the unseen test data. We will continue with color_mode set to grayscale for the reasons already noted: slightly better performance, slightly fewer parameters, slightly lower computational expense, and the image data itself is already grayscale.
We will once again create new data loaders for Model 7. As mentioned earlier, since data augmentation takes place when we instantiate an ImageDataGenerator object, it is convenient to create data loaders specific to each new model so we can easily fine-tune hyperparameters as needed. The ImageDataGenerator objects below use the parameters of our final, highest-performing iteration of the model, chosen after exhaustive fine-tuning, as discussed later.
batch_size = 128
# Creating ImageDataGenerator objects for grayscale colormode
datagen_train_grayscale = ImageDataGenerator(
    rescale = 1./255,
    brightness_range = (0.0, 2.0),
    horizontal_flip = True,
    shear_range = 0.3)
datagen_validation_grayscale = ImageDataGenerator(
    rescale = 1./255,
    brightness_range = (0.0, 2.0),
    horizontal_flip = True,
    shear_range = 0.3)
datagen_test_grayscale = ImageDataGenerator(
    rescale = 1./255,
    brightness_range = (0.0, 2.0),
    horizontal_flip = True,
    shear_range = 0.3)
# Creating train, validation, and test sets for grayscale colormode
print("Grayscale Images")
train_set_grayscale = datagen_train_grayscale.flow_from_directory(
    dir_train,
    target_size = (img_size, img_size),
    color_mode = "grayscale",
    batch_size = batch_size,
    class_mode = 'categorical',
    classes = ['happy', 'sad', 'neutral', 'surprise'],
    seed = 42,
    shuffle = True)
val_set_grayscale = datagen_validation_grayscale.flow_from_directory(
    dir_validation,
    target_size = (img_size, img_size),
    color_mode = "grayscale",
    batch_size = batch_size,
    class_mode = 'categorical',
    classes = ['happy', 'sad', 'neutral', 'surprise'],
    seed = 42,
    shuffle = False)
test_set_grayscale = datagen_test_grayscale.flow_from_directory(
    dir_test,
    target_size = (img_size, img_size),
    color_mode = "grayscale",
    batch_size = batch_size,
    class_mode = 'categorical',
    classes = ['happy', 'sad', 'neutral', 'surprise'],
    seed = 42,
    shuffle = False)
Grayscale Images
Found 15109 images belonging to 4 classes.
Found 4977 images belonging to 4 classes.
Found 128 images belonging to 4 classes.
The structure of Model 7 is below. Rather than simply modifying Model 6, the development of Model 7 entailed going back to the drawing board and devising a new strategy. Many configurations were tested, and the following architecture led to the best, most generalizable performance.
# Creating a Sequential model
model_7 = Sequential()
# Convolutional Block #1
model_7.add(Conv2D(64, (3, 3), input_shape = (48, 48, 1), activation = 'relu'))
model_7.add(BatchNormalization())
model_7.add(Conv2D(64, (3, 3), activation = 'relu'))
model_7.add(MaxPooling2D(pool_size=(2, 2), strides=(2,2)))
model_7.add(Dropout(0.4))
# Convolutional Block #2
model_7.add(BatchNormalization())
model_7.add(Conv2D(128, (3, 3), activation='relu'))
model_7.add(BatchNormalization())
model_7.add(Conv2D(128, (3, 3), activation='relu'))
model_7.add(MaxPooling2D(pool_size = (2, 2), strides=(2,2)))
model_7.add(Dropout(0.4))
# Convolutional Block #3
model_7.add(BatchNormalization())
model_7.add(Conv2D(128, (3, 3), activation='relu'))
model_7.add(BatchNormalization())
model_7.add(Conv2D(128, (3, 3), activation='relu'))
model_7.add(MaxPooling2D(pool_size = (2, 2), strides=(2,2)))
model_7.add(Dropout(0.4))
# Additional convolutional layer with L2 kernel regularization
model_7.add(BatchNormalization())
model_7.add(Conv2D(128, (2, 2), kernel_regularizer = l2(0.025)))
model_7.add(BatchNormalization())
# Flatten layer
model_7.add(Flatten())
# Dense layers
model_7.add(Dense(1024, activation = 'relu'))
model_7.add(Dropout(0.2))
model_7.add(GaussianNoise(0.1))
model_7.add(Dense(1024, activation = 'relu'))
model_7.add(Dropout(0.2))
# Classifier
model_7.add(Dense(4, activation = 'softmax'))
model_7.summary()
Model: "sequential"
_________________________________________________________________________
 Layer (type)                                Output Shape         Param #
=========================================================================
 conv2d (Conv2D)                             (None, 46, 46, 64)   640
 batch_normalization (BatchNormalization)    (None, 46, 46, 64)   256
 conv2d_1 (Conv2D)                           (None, 44, 44, 64)   36928
 max_pooling2d (MaxPooling2D)                (None, 22, 22, 64)   0
 dropout (Dropout)                           (None, 22, 22, 64)   0
 batch_normalization_1 (BatchNormalization)  (None, 22, 22, 64)   256
 conv2d_2 (Conv2D)                           (None, 20, 20, 128)  73856
 batch_normalization_2 (BatchNormalization)  (None, 20, 20, 128)  512
 conv2d_3 (Conv2D)                           (None, 18, 18, 128)  147584
 max_pooling2d_1 (MaxPooling2D)              (None, 9, 9, 128)    0
 dropout_1 (Dropout)                         (None, 9, 9, 128)    0
 batch_normalization_3 (BatchNormalization)  (None, 9, 9, 128)    512
 conv2d_4 (Conv2D)                           (None, 7, 7, 128)    147584
 batch_normalization_4 (BatchNormalization)  (None, 7, 7, 128)    512
 conv2d_5 (Conv2D)                           (None, 5, 5, 128)    147584
 max_pooling2d_2 (MaxPooling2D)              (None, 2, 2, 128)    0
 dropout_2 (Dropout)                         (None, 2, 2, 128)    0
 batch_normalization_5 (BatchNormalization)  (None, 2, 2, 128)    512
 conv2d_6 (Conv2D)                           (None, 1, 1, 128)    65664
 batch_normalization_6 (BatchNormalization)  (None, 1, 1, 128)    512
 flatten (Flatten)                           (None, 128)          0
 dense (Dense)                               (None, 1024)         132096
 dropout_3 (Dropout)                         (None, 1024)         0
 gaussian_noise (GaussianNoise)              (None, 1024)         0
 dense_1 (Dense)                             (None, 1024)         1049600
 dropout_4 (Dropout)                         (None, 1024)         0
 dense_2 (Dense)                             (None, 4)            4100
=========================================================================
Total params: 1,808,708
Trainable params: 1,807,172
Non-trainable params: 1,536
_________________________________________________________________________
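The spatial dimensions in this summary follow directly from the 'valid' padding used in Model 7 (unlike Model 6, these Conv2D layers do not set padding='same'). Each k x k convolution shrinks each side by k - 1, and each 2x2 max pool halves it with floor division. A quick hand-check:

```python
# Tracing Model 7's spatial dimensions layer by layer.
def conv_valid(size, k):
    """Output side length of a k x k convolution with 'valid' padding, stride 1."""
    return size - k + 1

def pool(size):
    """Output side length of 2x2 max pooling with stride 2."""
    return size // 2

s = 48
s = conv_valid(s, 3); s = conv_valid(s, 3); s = pool(s)  # block 1: 46 -> 44 -> 22
s = conv_valid(s, 3); s = conv_valid(s, 3); s = pool(s)  # block 2: 20 -> 18 -> 9
s = conv_valid(s, 3); s = conv_valid(s, 3); s = pool(s)  # block 3: 7 -> 5 -> 2
s = conv_valid(s, 2)                                     # final 2x2 conv: 1
print(s)  # 1 -- matching the (None, 1, 1, 128) shape in the summary
```

This aggressive shrinkage is why the Flatten layer sees only 128 values, keeping the dense layers (and the total parameter count) comparatively small.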
# Creating a checkpoint which saves model weights from the best epoch
checkpoint = ModelCheckpoint('./model_7.h5', monitor='val_accuracy', verbose=1, save_best_only=True, mode='auto')
# Initiates early stopping if validation loss does not continue to improve
early_stopping = EarlyStopping(
    monitor = 'val_loss',
    min_delta = 0,
    patience = 5,
    verbose = 1,
    restore_best_weights = True)
# Slows the learning rate when validation loss does not improve
reduce_learningrate = ReduceLROnPlateau(
    monitor = 'val_loss',
    factor = 0.2,
    patience = 2,
    verbose = 1,
    min_delta = 0.0001)
callbacks_list = [checkpoint, early_stopping, reduce_learningrate]
# Compiling model with optimizer set to Adam, loss set to categorical_crossentropy, and metrics set to accuracy
model_7.compile(optimizer = 'Adam', loss = 'categorical_crossentropy', metrics = ['accuracy'])
# Fitting model with epochs set to 200
history_7 = model_7.fit(train_set_grayscale, validation_data = val_set_grayscale, epochs = 200, callbacks = callbacks_list)
Epoch 1/200:  loss: 2.4585 - accuracy: 0.2759 - val_loss: 1.5317 - val_accuracy: 0.1951 - lr: 1.0000e-03
Epoch 6/200:  loss: 0.9598 - accuracy: 0.6138 - val_loss: 0.9410 - val_accuracy: 0.6343 - lr: 1.0000e-03
Epoch 11/200: loss: 0.8622 - accuracy: 0.6655 - val_loss: 0.7876 - val_accuracy: 0.7012 - lr: 1.0000e-03
Epoch 13: ReduceLROnPlateau reducing learning rate to 2.0000e-04
Epoch 15/200: loss: 0.7103 - accuracy: 0.7206 - val_loss: 0.6913 - val_accuracy: 0.7372 - lr: 2.0000e-04
Epoch 18: ReduceLROnPlateau reducing learning rate to 4.0000e-05
Epoch 23/200: loss: 0.6498 - accuracy: 0.7408 - val_loss: 0.6563 - val_accuracy: 0.7462 - lr: 4.0000e-05
[Training log abridged.]
- val_loss: 0.6579 - val_accuracy: 0.7412 - lr: 4.0000e-05 Epoch 25/200 119/119 [==============================] - ETA: 0s - loss: 0.6451 - accuracy: 0.7409 Epoch 25: val_accuracy improved from 0.74623 to 0.74784, saving model to ./model_7.h5 119/119 [==============================] - 18s 155ms/step - loss: 0.6451 - accuracy: 0.7409 - val_loss: 0.6534 - val_accuracy: 0.7478 - lr: 4.0000e-05 Epoch 26/200 119/119 [==============================] - ETA: 0s - loss: 0.6475 - accuracy: 0.7390 Epoch 26: val_accuracy did not improve from 0.74784 119/119 [==============================] - 18s 152ms/step - loss: 0.6475 - accuracy: 0.7390 - val_loss: 0.6450 - val_accuracy: 0.7436 - lr: 4.0000e-05 Epoch 27/200 119/119 [==============================] - ETA: 0s - loss: 0.6451 - accuracy: 0.7389 Epoch 27: val_accuracy improved from 0.74784 to 0.74844, saving model to ./model_7.h5 119/119 [==============================] - 17s 143ms/step - loss: 0.6451 - accuracy: 0.7389 - val_loss: 0.6431 - val_accuracy: 0.7484 - lr: 4.0000e-05 Epoch 28/200 119/119 [==============================] - ETA: 0s - loss: 0.6431 - accuracy: 0.7427 Epoch 28: val_accuracy did not improve from 0.74844 119/119 [==============================] - 18s 147ms/step - loss: 0.6431 - accuracy: 0.7427 - val_loss: 0.6518 - val_accuracy: 0.7412 - lr: 4.0000e-05 Epoch 29/200 119/119 [==============================] - ETA: 0s - loss: 0.6350 - accuracy: 0.7465 Epoch 29: val_accuracy did not improve from 0.74844 Epoch 29: ReduceLROnPlateau reducing learning rate to 8.000000525498762e-06. 
119/119 [==============================] - 17s 146ms/step - loss: 0.6350 - accuracy: 0.7465 - val_loss: 0.6473 - val_accuracy: 0.7458 - lr: 4.0000e-05 Epoch 30/200 119/119 [==============================] - ETA: 0s - loss: 0.6359 - accuracy: 0.7452 Epoch 30: val_accuracy did not improve from 0.74844 119/119 [==============================] - 20s 166ms/step - loss: 0.6359 - accuracy: 0.7452 - val_loss: 0.6513 - val_accuracy: 0.7422 - lr: 8.0000e-06 Epoch 31/200 119/119 [==============================] - ETA: 0s - loss: 0.6340 - accuracy: 0.7468 Epoch 31: val_accuracy did not improve from 0.74844 Epoch 31: ReduceLROnPlateau reducing learning rate to 1.6000001778593287e-06. 119/119 [==============================] - 20s 166ms/step - loss: 0.6340 - accuracy: 0.7468 - val_loss: 0.6469 - val_accuracy: 0.7450 - lr: 8.0000e-06 Epoch 32/200 119/119 [==============================] - ETA: 0s - loss: 0.6317 - accuracy: 0.7463 Epoch 32: val_accuracy did not improve from 0.74844 119/119 [==============================] - 19s 158ms/step - loss: 0.6317 - accuracy: 0.7463 - val_loss: 0.6409 - val_accuracy: 0.7454 - lr: 1.6000e-06 Epoch 33/200 119/119 [==============================] - ETA: 0s - loss: 0.6375 - accuracy: 0.7435 Epoch 33: val_accuracy did not improve from 0.74844 119/119 [==============================] - 16s 139ms/step - loss: 0.6375 - accuracy: 0.7435 - val_loss: 0.6499 - val_accuracy: 0.7436 - lr: 1.6000e-06 Epoch 34/200 119/119 [==============================] - ETA: 0s - loss: 0.6296 - accuracy: 0.7457 Epoch 34: val_accuracy did not improve from 0.74844 119/119 [==============================] - 18s 148ms/step - loss: 0.6296 - accuracy: 0.7457 - val_loss: 0.6377 - val_accuracy: 0.7472 - lr: 1.6000e-06 Epoch 35/200 119/119 [==============================] - ETA: 0s - loss: 0.6268 - accuracy: 0.7464 Epoch 35: val_accuracy did not improve from 0.74844 119/119 [==============================] - 17s 145ms/step - loss: 0.6268 - accuracy: 0.7464 - val_loss: 0.6451 - 
val_accuracy: 0.7422 - lr: 1.6000e-06 Epoch 36/200 119/119 [==============================] - ETA: 0s - loss: 0.6312 - accuracy: 0.7454 Epoch 36: val_accuracy did not improve from 0.74844 Epoch 36: ReduceLROnPlateau reducing learning rate to 3.200000264769187e-07. 119/119 [==============================] - 17s 143ms/step - loss: 0.6312 - accuracy: 0.7454 - val_loss: 0.6426 - val_accuracy: 0.7464 - lr: 1.6000e-06 Epoch 37/200 119/119 [==============================] - ETA: 0s - loss: 0.6247 - accuracy: 0.7494 Epoch 37: val_accuracy did not improve from 0.74844 119/119 [==============================] - 17s 142ms/step - loss: 0.6247 - accuracy: 0.7494 - val_loss: 0.6382 - val_accuracy: 0.7426 - lr: 3.2000e-07 Epoch 38/200 119/119 [==============================] - ETA: 0s - loss: 0.6301 - accuracy: 0.7474 Epoch 38: val_accuracy improved from 0.74844 to 0.74945, saving model to ./model_7.h5 Epoch 38: ReduceLROnPlateau reducing learning rate to 6.400000529538374e-08. 119/119 [==============================] - 17s 144ms/step - loss: 0.6301 - accuracy: 0.7474 - val_loss: 0.6422 - val_accuracy: 0.7494 - lr: 3.2000e-07 Epoch 39/200 119/119 [==============================] - ETA: 0s - loss: 0.6324 - accuracy: 0.7408 Epoch 39: val_accuracy did not improve from 0.74945 Restoring model weights from the end of the best epoch: 34. 119/119 [==============================] - 17s 141ms/step - loss: 0.6324 - accuracy: 0.7408 - val_loss: 0.6408 - val_accuracy: 0.7490 - lr: 6.4000e-08 Epoch 39: early stopping
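The learning-rate trace in the log above (1.0e-3 → 2.0e-4 → 4.0e-5 → 8.0e-6 → 1.6e-6 → 3.2e-7 → 6.4e-8) is consistent with `ReduceLROnPlateau` multiplying the rate by a factor of 0.2 at each reduction. A minimal sketch reconstructing that schedule (the factor is inferred from the logged values; the exact callback arguments are not shown in the log):

```python
# Reconstructing the learning-rate schedule implied by the training log.
# ReduceLROnPlateau multiplies the current rate by `factor` each time the
# monitored metric plateaus; factor=0.2 is inferred from the logged values.
initial_lr = 1e-3
factor = 0.2

lr_steps = [initial_lr * factor ** k for k in range(7)]
for k, lr in enumerate(lr_steps):
    print(f"reduction {k}: lr = {lr:.1e}")
```

Six reductions take the rate across five orders of magnitude, which is why the later epochs show only marginal changes in loss and accuracy.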
# Plotting the accuracies
plt.figure(figsize = (10, 5))
plt.plot(history_7.history['accuracy'])
plt.plot(history_7.history['val_accuracy'])
plt.title('Accuracy - Final Model')
plt.ylabel('Accuracy')
plt.xlabel('Epoch')
plt.legend(['Training', 'Validation'], loc='lower right')
plt.show()
# Plotting the losses
plt.figure(figsize = (10, 5))
plt.plot(history_7.history['loss'])
plt.plot(history_7.history['val_loss'])
plt.title('Loss - Final Model')
plt.ylabel('Loss')
plt.xlabel('Epoch')
plt.legend(['Training', 'Validation'], loc='upper right')
plt.show()
# Evaluating the model's performance on the test set
accuracy = model_7.evaluate(test_set_grayscale)
1/1 [==============================] - 0s 359ms/step - loss: 0.6198 - accuracy: 0.7500
Observations and Insights:
Model 7, rewarding us for all of our efforts, displays the best all-around performance. Accuracy is stable at 0.75 across training, validation, and test data, and loss is stable at roughly 0.63 (0.62 to 0.64) across all three as well. As evidenced by the above graphs, there is no noticeable generalization gap: the accuracy and loss curves move more or less in tandem, leveling off around epoch 25 and staying together from that point forward. The model neither overfits nor underfits the training data. The images below show the accuracy and loss curves for the same model run out to 115 epochs; the model converges at reasonable levels of accuracy and loss, and it generalizes well.
[Accuracy and loss curves for the final model, run out to 115 epochs]
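The absence of a generalization gap can also be checked numerically from the Keras history object. A minimal sketch, using a toy dictionary in place of `history_7.history` (the real object has the same `'accuracy'` and `'val_accuracy'` keys; the values below are illustrative):

```python
# Computing the train/validation accuracy gap over the final few epochs.
# `history` stands in for history_7.history; same keys, toy values.
history = {
    'accuracy':     [0.63, 0.70, 0.73, 0.745, 0.747, 0.746],
    'val_accuracy': [0.60, 0.69, 0.73, 0.742, 0.748, 0.749],
}

last_n = 3  # average over the last few epochs to smooth epoch-to-epoch noise
gap = (sum(history['accuracy'][-last_n:]) / last_n
       - sum(history['val_accuracy'][-last_n:]) / last_n)
print(f"mean generalization gap over last {last_n} epochs: {gap:+.4f}")
```

A gap near zero, as seen here and in Model 7's curves, indicates that the model performs about as well on unseen validation data as on the data it was trained on.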
Much like Model 6, this model underwent numerous transformations before arriving at its final state. Parameters were tuned, layers were added, others were removed, and in the end, the above iteration of the model was determined to be the best. The table below shows the impact that some of the most important aspects of the model have on its overall performance. While some individual metrics may be better than those of the final model, each of the modifications below, whether applied individually or in combination, results in a generalization gap that is not present in the final model.
| Model Changes | Train Loss | Train Accuracy | Val Loss | Val Accuracy |
|---|---|---|---|---|
| Final Model | 0.63 | 0.75 | 0.64 | 0.75 |
| Remove "regularization" block | 0.63 | 0.76 | 0.68 | 0.73 |
| Remove L2 kernel regularizer | 0.62 | 0.74 | 0.64 | 0.73 |
| Remove Gaussian Noise | 0.65 | 0.73 | 0.66 | 0.74 |
| Reduce kernel size to (2,2) | 0.63 | 0.74 | 0.66 | 0.74 |
| Dropout levels reduced to 0.2 | 0.57 | 0.78 | 0.65 | 0.74 |
| Remove BatchNormalization | 0.74 | 0.70 | 0.69 | 0.72 |
| Include relu activation in regularization block | 0.63 | 0.74 | 0.63 | 0.74 |
| Batch size = 32 | 0.62 | 0.75 | 0.65 | 0.74 |
| Data augmentation with rotation range = 20 | 0.69 | 0.72 | 0.67 | 0.74 |
| Data augmentation with zoom range = 0.2 | 0.71 | 0.71 | 0.69 | 0.73 |
| Vertical flip = True | 0.74 | 0.71 | 0.70 | 0.74 |
| Only 1 convolutional layer per block | 0.84 | 0.66 | 0.78 | 0.70 |
Below are the accuracy and loss scores for each of our models, first in a tabular format, then represented visually in the form of bar charts.
| Model | Parameters | Train Loss | Train Accuracy | Val Loss | Val Accuracy | Test Loss | Test Accuracy |
|---|---|---|---|---|---|---|---|
| Model 1.1: Baseline Grayscale | 605,060 | 0.68 | 0.72 | 0.78 | 0.68 | 0.82 | 0.65 |
| Model 1.2: Baseline RGB | 605,572 | 0.68 | 0.72 | 0.78 | 0.68 | 0.80 | 0.63 |
| Model 2.1: 2nd Gen Grayscale | 455,780 | 0.54 | 0.78 | 0.74 | 0.71 | 0.81 | 0.69 |
| Model 2.2: 2nd Gen RGB | 457,828 | 0.59 | 0.76 | 0.72 | 0.71 | 0.70 | 0.68 |
| Model 3: VGG16 | 14,714,688 | 0.71 | 0.72 | 0.80 | 0.67 | 0.74 | 0.66 |
| Model 4: ResNet V2 | 42,658,176 | 1.43 | 0.26 | 1.35 | 0.36 | 1.40 | 0.28 |
| Model 5: EfficientNet | 8,769,374 | 1.39 | 0.26 | 1.37 | 0.24 | 1.40 | 0.25 |
| Model 6: Milestone 1 | 2,119,172 | 0.54 | 0.79 | 0.60 | 0.76 | 0.56 | 0.76 |
| Model 7: Final Model | 1,808,708 | 0.63 | 0.75 | 0.64 | 0.75 | 0.62 | 0.75 |
# creating a dictionary containing model accuracies
dict_model_acc = {
    "Model": ["1.1", "1.2", "2.1", "2.2", "3", "4", "5", "6", "7"],
    "Train": [0.72, 0.72, 0.78, 0.76, 0.72, 0.26, 0.26, 0.79, 0.75],
    "Validate": [0.68, 0.68, 0.71, 0.71, 0.67, 0.36, 0.24, 0.76, 0.75],
    "Test": [0.65, 0.63, 0.69, 0.68, 0.66, 0.28, 0.25, 0.76, 0.75]}
# converting dictionary to dataframe
df_model_acc = pd.DataFrame.from_dict(dict_model_acc)
# plotting accuracy scores for all models
df_model_acc.groupby("Model", sort=False).mean().plot(
    kind='bar', figsize=(10, 5), title="Accuracy Scores Across Models",
    ylabel="Accuracy Score", xlabel="Models", rot=0, fontsize=12,
    width=0.9, colormap="Pastel2", edgecolor='black')
plt.legend(loc=(.59, 0.77))
plt.show()
# creating a dictionary containing model loss
dict_model_loss = {
    "Model": ["1.1", "1.2", "2.1", "2.2", "3", "4", "5", "6", "7"],
    "Train": [0.68, 0.68, 0.54, 0.59, 0.71, 1.43, 1.39, 0.54, 0.63],
    "Validate": [0.78, 0.78, 0.74, 0.72, 0.80, 1.35, 1.37, 0.60, 0.64],
    "Test": [0.82, 0.80, 0.81, 0.70, 0.74, 1.40, 1.40, 0.56, 0.62]}
# converting dictionary to dataframe
df_model_loss = pd.DataFrame.from_dict(dict_model_loss)
# plotting loss scores for all models
df_model_loss.groupby("Model", sort=False).mean().plot(
    kind='bar', figsize=(10, 5), title="Loss Scores Across Models",
    ylabel="Loss Score", xlabel="Models", rot=0, fontsize=12,
    width=0.9, colormap="Pastel2", edgecolor='black')
plt.show()
Observations and Insights:
The above graphs clearly depict the overfitting that occurs in Models 1.1, 1.2, 2.1, 2.2, and 3, with accuracy scores declining in steps as we move from training to validation to test data, and loss scores rising in the same manner. The graphs also show the total dysfunction of Models 4 and 5, with very low accuracy and very high loss. Finally, it is clear from the graphs that Models 6 and 7 are the most consistent, most generalizable models, and that the final decision regarding a deployable model should be made between those two options.
In deciding between Models 6 and 7, it is useful to revisit the accuracy and loss curves for the two models.
[Accuracy and loss curves for Models 6 and 7]
While the accuracy and loss curves for both models stabilize by epoch 20-25, there is no gap between the training and validation curves for Model 7, while a slight gap does exist for Model 6. Model 6's individual scores are better (higher accuracy and lower loss), but when viewed together, the train-validation spread is larger for Model 6 and virtually nonexistent for Model 7. It is difficult to justify deploying a slightly overfitting model when a slightly less accurate but more generalizable model is available. Model 7 will be our final model.
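The trade-off between the two models can be quantified directly from the summary table. A quick sketch using the accuracy and loss values transcribed from above:

```python
# Train-vs-validation spread for Models 6 and 7, using the values from
# the summary table above.
models = {
    "Model 6": {"train_acc": 0.79, "val_acc": 0.76, "train_loss": 0.54, "val_loss": 0.60},
    "Model 7": {"train_acc": 0.75, "val_acc": 0.75, "train_loss": 0.63, "val_loss": 0.64},
}

for name, m in models.items():
    acc_gap = m["train_acc"] - m["val_acc"]
    loss_gap = m["val_loss"] - m["train_loss"]
    print(f"{name}: accuracy gap = {acc_gap:.2f}, loss gap = {loss_gap:.2f}")
```

Model 6 wins on every individual metric, but Model 7's near-zero gaps are what make it the safer choice for deployment on unseen data.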
test_set = datagen_test_grayscale.flow_from_directory(dir_test,
                                                      target_size = (img_size, img_size),
                                                      color_mode = "grayscale",
                                                      batch_size = 128,
                                                      class_mode = 'categorical',
                                                      classes = ['happy', 'sad', 'neutral', 'surprise'],
                                                      seed = 42,
                                                      shuffle = False)
test_images, test_labels = next(test_set)
pred = model_7.predict(test_images)
pred = np.argmax(pred, axis = 1)
y_true = np.argmax(test_labels, axis = 1)
# Printing the classification report
print(classification_report(y_true, pred))
# Plotting the heatmap using the confusion matrix
cm = confusion_matrix(y_true, pred)
plt.figure(figsize = (8, 5))
sns.heatmap(cm, annot = True, fmt = '.0f', xticklabels = ['happy', 'sad', 'neutral', 'surprise'], yticklabels = ['happy', 'sad', 'neutral', 'surprise'])
plt.ylabel('Actual')
plt.xlabel('Predicted')
plt.show()
Found 128 images belonging to 4 classes.
4/4 [==============================] - 0s 16ms/step
precision recall f1-score support
0 0.79 0.84 0.82 32
1 0.67 0.62 0.65 32
2 0.62 0.72 0.67 32
3 0.96 0.81 0.88 32
accuracy 0.75 128
macro avg 0.76 0.75 0.75 128
weighted avg 0.76 0.75 0.75 128
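The per-class recall values in the report can be recovered directly from the confusion matrix, since recall is the diagonal entry divided by the row sum. A sketch using a hypothetical matrix whose diagonals reproduce the recall values above (the off-diagonal counts are illustrative only, not the model's actual errors):

```python
import numpy as np

# Hypothetical confusion matrix for the 128-image test set (32 per class,
# class order: happy, sad, neutral, surprise). Diagonals chosen to match
# the recall values in the report; off-diagonal counts are illustrative.
cm = np.array([
    [27,  2,  3,  0],   # happy:    recall 27/32 = 0.84
    [ 3, 20,  9,  0],   # sad:      recall 20/32 = 0.62
    [ 4,  5, 23,  0],   # neutral:  recall 23/32 = 0.72
    [ 0,  3,  3, 26],   # surprise: recall 26/32 = 0.81
])

recall = np.diag(cm) / cm.sum(axis=1)
print(np.round(recall, 2))
```

Precision works the same way along columns (diagonal entry over column sum), which is why a class that attracts many false positives, like 'neutral' here, shows a lower precision despite decent recall.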
Observations and Insights:
The classification report confirms our earlier test-set accuracy of 0.75 on the 128 test images (32 per class). 'Surprise' (class 3) is the easiest expression for the model to identify, with a precision of 0.96 and an f1-score of 0.88, followed by 'happy' (class 0) at an f1-score of 0.82. 'Sad' (class 1) and 'neutral' (class 2) prove the most difficult, with f1-scores of 0.65 and 0.67 respectively, which is consistent with the subtler visual differences between those two expressions.
# Making predictions on the test data
y_pred_test = model_7.predict(test_set)
# Converting probabilities to class labels
y_pred_test_classes = np.argmax(y_pred_test, axis = 1)
# Calculating the probability of the predicted class
y_pred_test_max_probas = np.max(y_pred_test, axis = 1)
classes = ['happy', 'sad', 'neutral', 'surprise']
rows = 3
cols = 4
fig = plt.figure(figsize = (12, 12))
for i in range(cols):
    for j in range(rows):
        random_index = np.random.randint(0, len(test_labels))  # generating random integer
        ax = fig.add_subplot(rows, cols, i * rows + j + 1)
        ax.imshow(test_images[random_index, :])  # selecting random test image
        pred_label = classes[y_pred_test_classes[random_index]]  # predicted label of selected image
        pred_proba = y_pred_test_max_probas[random_index]  # probability associated with model's prediction
        true_label = test_labels[random_index]  # actual class label of selected image
        if true_label[0] == 1:  # converting one-hot array to class label
            true_label = "happy"
        elif true_label[1] == 1:
            true_label = "sad"
        elif true_label[2] == 1:
            true_label = "neutral"
        else:
            true_label = "surprise"
        ax.set_title("actual: {}\npredicted: {}\nprobability: {:.3}\n".format(
            true_label, pred_label, pred_proba))
plt.gray()
plt.show()
1/1 [==============================] - 0s 47ms/step
Observations and Insights:
Over the course of this project, we have thoroughly explored the ins and outs of the given data through visualization and analysis, developed nine different convolutional neural networks, and drawn many insights from our observations along the way. Though much has already been discussed, a summary of the problem, our findings, and recommendations for implementation can be found below.
As noted at the outset of this project, a person's facial expression can be a powerful window into their true feelings, and as such can serve as a highly effective proxy for sentiment. Emotion AI (affective computing) attempts to leverage this proxy by detecting and processing facial expressions with neural networks, in an effort to successfully interpret human emotion and respond appropriately. Developing models that can accurately detect facial emotion is therefore an important driver of advancement in the realm of artificial intelligence and emotionally intelligent machines.
The objective of this project was to utilize deep learning techniques to create a computer vision model that can accurately detect and interpret facial emotions. This model should be capable of performing multi-class classification on images containing one of four facial expressions: happy, sad, neutral, and surprise. As discussed earlier, convolutional neural networks are currently the most effective algorithmic tool available for processing images, so our solution takes the form of a CNN.
Over the course of this project, 9 CNNs were developed (with colormode variations RGB and grayscale). Before model development, the data was visually analyzed and then augmented based on that analysis, the specifics of which depended on the individual model being developed. Models ranged from simple, baseline models to much more complex architectures, including transfer learning models. Ultimately, our final model was chosen for its relatively high accuracy (compared to the other models) and, more importantly, because it is highly generalizable. A tabular and graphical summary of model performance is below.
| Model | Parameters | Train Loss | Train Accuracy | Val Loss | Val Accuracy | Test Loss | Test Accuracy |
|---|---|---|---|---|---|---|---|
| Model 1.1: Baseline Grayscale | 605,060 | 0.68 | 0.72 | 0.78 | 0.68 | 0.82 | 0.65 |
| Model 1.2: Baseline RGB | 605,572 | 0.68 | 0.72 | 0.78 | 0.68 | 0.80 | 0.63 |
| Model 2.1: 2nd Gen Grayscale | 455,780 | 0.54 | 0.78 | 0.74 | 0.71 | 0.81 | 0.69 |
| Model 2.2: 2nd Gen RGB | 457,828 | 0.59 | 0.76 | 0.72 | 0.71 | 0.70 | 0.68 |
| Model 3: VGG16 | 14,714,688 | 0.71 | 0.72 | 0.80 | 0.67 | 0.74 | 0.66 |
| Model 4: ResNet V2 | 42,658,176 | 1.43 | 0.26 | 1.35 | 0.36 | 1.40 | 0.28 |
| Model 5: EfficientNet | 8,769,374 | 1.39 | 0.26 | 1.37 | 0.24 | 1.40 | 0.25 |
| Model 6: Milestone 1 | 2,119,172 | 0.54 | 0.79 | 0.60 | 0.76 | 0.56 | 0.76 |
| Model 7: Final Model | 1,808,708 | 0.63 | 0.75 | 0.64 | 0.75 | 0.62 | 0.75 |
The architecture for our final model (Model 7) is more complex than our baseline models, but not nearly as complex as the VGG16, ResNet, or EfficientNet transfer learning models that were developed. Model 7 consists of three fairly standard convolutional blocks with relu activation, BatchNormalization, MaxPooling, and Dropout layers. The critical block that conquered overfitting and removed the generalization gap was a regularization block consisting of BatchNormalization layers and a convolutional layer with L2 regularization. Two additional key features of Model 7 are heavy usage of BatchNormalization throughout the architecture, as well as the addition of GaussianNoise between the two fully-connected layers.
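As a rough sanity check on where Model 7's ~1.8 million parameters come from, the parameter count of a single Conv2D layer is (kernel_height × kernel_width × input_channels + 1) × filters. A small sketch (the filter counts below are illustrative, not the exact Model 7 configuration):

```python
def conv2d_params(kernel, in_channels, filters, bias=True):
    """Parameter count of a Conv2D layer: one kernel per filter, plus bias."""
    kh, kw = kernel
    return (kh * kw * in_channels + int(bias)) * filters

# Illustrative stack: four blocks doubling the filter count, grayscale input.
channels = [1, 64, 128, 256, 512]
total = sum(conv2d_params((3, 3), cin, cout)
            for cin, cout in zip(channels, channels[1:]))
print(f"conv parameters in this illustrative stack: {total:,}")
```

The arithmetic shows why parameter counts are dominated by the later, wider convolutional layers and the fully-connected head, and why Model 7 at 1.8 million parameters sits between the baselines (~600 thousand) and the transfer learning models (8-43 million).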
The combination of the above features delivered a model with training, validation, and test accuracy of 0.75. While 75% accuracy may not seem particularly high, correctly classifying the FER2013 dataset, from which our data appears to have been drawn, is extremely challenging, with human-level accuracy standing at roughly 65%. Model 7 may therefore be more accurate at classifying our dataset than a human, but whether 75% accuracy is ultimately high enough for deployment depends entirely on the business use case and the cost that would be incurred in any effort to improve model performance.
For example, if this computer vision model is being developed to create photo filters for a phone application, perhaps an accuracy of 0.75 is sufficient. It is better than random guessing (0.25) and also better than a human being (0.65). As the stakes in this instance are pretty low, 75% accuracy would likely suffice for model deployment, particularly if 75% accuracy is higher than that of similar phone applications on the market. If, on the other hand, this computer vision model is being developed for use in some sort of life or death medical situation, 75% accuracy may be too low, and improvement might justify the additional expenses incurred.
The spectrum of possible use cases for Emotion AI in general, and facial emotion recognition technology in particular, is so broad that it is difficult to give a general set of recommendations for implementation. It very much comes down to the specific use case for each business, organization, or government.
The first big question to answer is the following: will the collection of this private, emotional data require consent (or opting in) from the individual whose facial emotions are being recorded and analyzed? If consent is required and granted, that makes it easier from a business perspective, as long as the consent given by the individual was based on truthful, transparent terms, and the business lives up to its end of the agreement. If, however, a computer vision model will be used to extract data from individuals without their consent or knowledge, that puts a business, organization, or government in a much more vulnerable position, with huge potential for a privacy rights related backlash and consequent loss of reputation, brand loyalty, market share, legitimacy, etc.
Model 7, with an accuracy of 75 percent, should be considered deployable in some circumstances under certain conditions, and should absolutely not be considered deployable in others. For example, if a company is analyzing someone's facial reaction to an advertisement (with their permission) in an effort to better target future advertising campaigns or decide which customer demographic should receive a particular coupon in the mail, then 75 percent accuracy (again, with permission) is perfectly reasonable. If, on the other hand, the intention is to deploy this computer vision model in a situation that can materially impact someone's life in a serious way (denying a loan, denying a job, deciding guilt or innocence in a court of law, evaluating student performance in school, etc.), then 75 percent accuracy is nowhere near what it would need to be. On top of that, we should give serious thought to whether even the most accurate computer vision model should be deployed in those situations at all.
For the sake of this exercise, let us assume that a business is interested in our computer vision model to better understand how their advertising campaigns are perceived by current and potential customers. Some key recommendations would be:
Assuming the above to be true, stakeholder actionables could include:
Associated costs include:
The upside to deploying Emotion AI technology like our computer vision model is huge:
Key risks and challenges include several issues already discussed:
Potential further action: